OM nucleic - nucleic search, using sw model

Run on:         March 4, 2004, 06:44:49 ; Search time 2545 Seconds
                                          (without alignments)
                                          9504.270 Million cell updates/sec

Title:          US-09-852-100B-1
Perfect score:  810
Sequence:       1 atgcatattttaaaagggtc...aaacgcaattatatccataa 810

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       27513289 seqs, 14931090276 residues

Total number of hits satisfying chosen parameters:      55026578

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
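The throughput and hit-count figures in the header can be cross-checked against the other header values. The sketch below is only a sanity check under two assumptions that the report does not state: that one "cell update" is one query-residue by database-residue comparison, and that both strands of each database sequence were searched (the factor `STRANDS = 2`). All names in the code are illustrative.

```python
# Values copied from the report header.
query_length = 810             # "Sequence: 1 ... 810"
db_residues = 14_931_090_276   # "Searched: ... residues"
db_sequences = 27_513_289      # "Searched: ... seqs"
search_seconds = 2545          # "Search time 2545 Seconds"

# Assumption (not stated in the report): both strands are scanned.
STRANDS = 2


def million_cell_updates_per_sec(qlen, residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    cells = qlen * residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(query_length, db_residues, search_seconds)
total_hits = db_sequences * STRANDS

print(f"{rate:.3f} Million cell updates/sec")
print(f"{total_hits} hits")
```

Under these assumptions the computed values agree with the "Million cell updates/sec" and "Total number of hits" lines of the header, which supports (but does not prove) the two-strand reading.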



Database:       EST:
                1:  em_estba:*
                2:  em_esthum:*
                3:  em_estin:*
                4:  em_estmu:*
                5:  em_estov:*
                6:  em_estpl:*
                7:  em_estro:*
                8:  em_htc:*
                9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
RESULT 1
BG702759
LOCUS       BG702759                 678 bp    mRNA    linear   EST 07-MAY-2001
DEFINITION  602684629F1 NIH_MGC_95 Homo sapiens cDNA clone IMAGE:4817358 5',
            mRNA sequence.
ACCESSION   BG702759
VERSION     BG702759.1  GI:13974418
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 678)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM10717 row: i column: 07
            High quality sequence stop: 678.
FEATURES             Location/Qualifiers
     source          1..678
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:4817358"
                     /tissue_type="hippocampus"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_95"
                     /note="Organ: brain; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.5 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this
                     is a NIH_MGC Library."
ORIGIN



  Query Match             83.1%;  Score 673;  DB 12;  Length 678;
  Best Local Similarity   100.0%;  Pred. No. 4.7e-175;
  Matches  673;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;
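The two percentages in each summary appear to follow directly from the counts printed next to them. The sketch below assumes that Query Match is Score divided by Perfect score, and that Best Local Similarity is Matches divided by the number of alignment columns (matches + mismatches + indels). Both definitions are inferred from the numbers in this report rather than taken from the search program's documentation, and the function names are illustrative.

```python
PERFECT_SCORE = 810  # "Perfect score: 810" from the report header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the best achievable score (assumed definition)."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches, mismatches, indels):
    """Identical columns as a percentage of all alignment columns (assumed definition)."""
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


# Result 1: Score 673; Matches 673; Mismatches 0; Indels 0.
qm = query_match(673)
bls = best_local_similarity(673, 0, 0)
print(f"Query Match {qm:.1f}%  Best Local Similarity {bls:.1f}%")
```

The same two functions can be applied to the Score, Matches, Mismatches and Indels values of any other result in the listing to check its summary line.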



Qy        135 GCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGC 194
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          6 GCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGC 65

Qy        195 CGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCT 254
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         66 CGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCT 125

Qy        255 GTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGA 314
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        126 GTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGA 185

Qy        315 GGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAAAAT 374
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        186 GGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAAAAT 245

Qy        375 AAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGTTT 434
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        246 AAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGTTT 305

Qy        435 TCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAA 494
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        306 TCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAA 365

Qy        495 CGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGT 554
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        366 CGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGT 425

Qy        555 GGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATA 614
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        426 GGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATA 485

Qy        615 CCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAAT 674
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        486 CCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAAT 545

Qy        675 TGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATTAT 734
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        546 TGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATTAT 605

Qy        735 AGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAAAC 794
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        606 AGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAAAC 665

Qy        795 GCAATTATATCCA 807
              |||||||||||||
Db        666 GCAATTATATCCA 678



RESULT 2
BG723403
LOCUS       BG723403                 836 bp    mRNA    linear   EST 08-MAY-2001
DEFINITION  602694073F1 NIH_MGC_97 Homo sapiens cDNA clone IMAGE:4826035 5',
            mRNA sequence.
ACCESSION   BG723403
VERSION     BG723403.1  GI:14002590
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 836)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM10740 row: b column: 20
            High quality sequence stop: 760.
FEATURES             Location/Qualifiers
     source          1..836
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:4826035"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_97"
                     /note="Organ: testis; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.2 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



  Query Match             79.5%;  Score 643.8;  DB 12;  Length 836;
  Best Local Similarity   99.7%;  Pred. No. 6.2e-167;
  Matches  645;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy        164 GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 223
              |||  |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          2 GCGGAAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 61

Qy        224 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 283
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         62 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 121

Qy        284 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 343
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        122 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 181

Qy        344 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 403
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        182 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 241

Qy        404 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 463
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        242 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 301

Qy        464 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 523
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        302 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 361

Qy        524 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 583
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        362 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 421

Qy        584 GGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCA 643
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        422 GGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCA 481

Qy        644 CTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 703
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        482 CTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 541

Qy        704 TTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGAC 763
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        542 TTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGAC 601

Qy        764 TGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||||||||||||||||||||||||||
Db        602 TGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 648
RESULT 3
CB996712
LOCUS       CB996712                 788 bp    mRNA    linear   EST 01-MAY-2003
DEFINITION  AGENCOURT_13627955 NIH_MGC_148 Homo sapiens cDNA clone
            IMAGE:30334410 5', mRNA sequence.
ACCESSION   CB996712
VERSION     CB996712.1  GI:30291232
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 788)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Stefan Hansson
            cDNA Library Preparation: Michael J. Brownstein (NHGRI) with help
            and advice from Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: NDAM354 row: j column: 19
            High quality sequence stop: 566.
FEATURES             Location/Qualifiers
     source          1..788
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:30334410"
                     /tissue_type="pre-eclamptic placenta"
                     /lab_host="DH10B TonA"
                     /clone_lib="NIH_MGC_148"
                     /note="Organ: placenta; Vector: pBluescriptR; Site_1:
                     SalI-XhoI; Site_2: BamH; Library is oligo-dT primed and
                     directionally cloned using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average insert
                     size 2.3 kb and normalized to ROT 5. This is a primary
                     library enriched for full-length clones and constructed
                     using the Cap-trapper method (Carninci, in preparation).
                     Library constructed by M. Brownstein (NIMH/NHGRI,
                     National Institutes of Health). Note: this is a NIH_MGC
                     Library."
ORIGIN



  Query Match             79.4%;  Score 642.8;  DB 14;  Length 788;
  Best Local Similarity   99.7%;  Pred. No. 1.1e-166;
  Matches  644;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy        165 CGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGA 224
              | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         24 CCAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGA 83

Qy        225 GGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTG 284
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         84 GGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTG 143

Qy        285 GGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGT 344
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        144 GGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGT 203

Qy        345 GGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTG 404
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        204 GGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTG 263

Qy        405 TACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTC 464
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        264 TACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTC 323

Qy        465 CAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTG 524
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        324 CAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTG 383

Qy        525 CCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATG 584
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        384 CCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATG 443

Qy        585 GTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCAC 644
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        444 GTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCAC 503

Qy        645 TGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGT 704
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        504 TGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGT 563

Qy        705 TGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACT 764
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        564 TGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACT 623

Qy        765 GAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||||||||||||||| |||||||||
Db        624 GAGTATTACTAATGAAACATTTAGAAAAACGCAATTGTATCCATAA 669



RESULT 4
BG709182
LOCUS       BG709182                 658 bp    mRNA    linear   EST 07-MAY-2001
DEFINITION  602675061F1 NIH_MGC_96 Homo sapiens cDNA clone IMAGE:4797782 5',
            mRNA sequence.
ACCESSION   BG709182
VERSION     BG709182.1  GI:13987263
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 658)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM10684 row: i column: 15
            High quality sequence stop: 658.
FEATURES             Location/Qualifiers
     source          1..658
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:4797782"
                     /tissue_type="hypothalamus"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_96"
                     /note="Organ: brain; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.3 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



Query Match 78.6%; Score 636.4; DB 12; Length 658; 

Best Local Similarity 99.7%; Pred. No. 6.4e-165; 

Matches 637; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy        172 GTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 231
              | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          4 GGGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 63

Qy        232 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 291
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         64 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 123

Qy        292 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 351
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 183

Qy        352 TATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 411
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 TATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 243

Qy        412 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGC 471
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGC 303

Qy        472 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 531
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        304 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 363

Qy        532 GTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 591
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 GTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 423

Qy        592 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 651
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        424 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 483

Qy        652 TTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 711
              ||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 TTTNGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 543

Qy        712 TCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATT 771
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        544 TCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATT 603

Qy        772 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||||||||||||||||||
Db        604 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 642



RESULT 5
BC048995
LOCUS       BC048995                 982 bp    mRNA    linear   HTC 17-DEC-2003
DEFINITION  Homo sapiens beta-amyloid binding protein precursor, mRNA (cDNA
            clone IMAGE:5261702).
ACCESSION   BC048995
VERSION     BC048995.1  GI:28981340
KEYWORDS    HTC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 982)
  AUTHORS   Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
            Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
            Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
            Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
            Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
            Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
            Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
            Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
            Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
            McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
            Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
            Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
            Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
            Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
            Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
            Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
            Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
            Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
  TITLE     Generation and initial analysis of more than 15,000 full-length
            human and mouse cDNA sequences
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
   PUBMED   12477932
REFERENCE   2  (bases 1 to 982)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-MAR-2003) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI) & Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Institute for Systems Biology
            http://www.systemsbiology.org
            contact: amadan@systemsbiology.org
            Anup Madan, Jessica Fahey, Erin Helton, Mark Ketteman, Anuradha
            Madan, Stephanie Rodrigues, Amy Sanchez and Michelle Whiting
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 106 Row: h Column: 9
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 7019328
            This clone has the following problem: no 5' EST match.
FEATURES             Location/Qualifiers
     source          1..982
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5261702"
                     /tissue_type="Brain / hippocampus"
                     /clone_lib="NIH_MGC_95"
                     /lab_host="DH10B"
                     /note="Vector: pBluescript"
ORIGIN

Query Match 78.4%; Score 635; DB 11; Length 982; 

Best Local Similarity 99.2%; Pred. No. 1.8e-164; 

Matches 638; Conservative 0; Mismatches 5; Indels 0; Gaps 0; 

Qy        168 GAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGC 227
              | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GTAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGC 60

Qy        228 CGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGG 287
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGG 120

Qy        288 GGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGG 347
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGG 180

Qy        348 ACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTAC 407
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 ACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTAC 240

Qy        408 AAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAG 467
              |||||| |||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db        241 AAACTATACAGCTCATGTTTCCTGTTTTCCAGCACACAACATAACTTGTAAGGATTCCAG 300

Qy        468 TGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCG 527
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCG 360

Qy        528 AAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTT 587
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db        361 AAATGTAAATGGCTATTCCTACAAAGTGGCAGTAGCATTGTCTCTTTTTCTTGGATGGTT 420

Qy        588 GGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGT 647
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGT 480

Qy        648 AGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGG 707
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 AGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGG 540

Qy        708 ACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAG 767
              |||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
Db        541 ACCTTCAGATGGAAGTAGTTACATTATAGATTACTAAGGAACCAGACTTACAAGACTGAG 600

Qy        768 TATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||||||||||||||||||||||
Db        601 TATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 643



RESULT 6
BI458114
LOCUS       BI458114                 750 bp    mRNA    linear   EST 21-AUG-2001
DEFINITION  603198535F1 NIH_MGC_96 Homo sapiens cDNA clone IMAGE:5278064 5',
            mRNA sequence.
ACCESSION   BI458114
VERSION     BI458114.1  GI:15248770
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 750)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11702 row: e column: 09
            High quality sequence stop: 711.
FEATURES             Location/Qualifiers
     source          1..750
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5278064"
                     /tissue_type="hypothalamus"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_96"
                     /note="Organ: brain; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.3 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



  Query Match             78.0%;  Score 631.6;  DB 12;  Length 750;
  Best Local Similarity   99.1%;  Pred. No. 1.4e-163;
  Matches  645;  Conservative    0;  Mismatches    5;  Indels    1;  Gaps    1;



Qy        161 GTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTC 220
              | || |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          4 GGGGGGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTC 63

Qy        221 CGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGAC 280
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         64 CGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGAC 123

Qy        281 CCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCA 340
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 CCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCA 183

Qy        341 AAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTA 400
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 AAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTA 243

Qy        401 ACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGG 460
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        244 ACTGTACAAACTACACAGCTCATGTTTCATGTTTTCCAGCACCCAACATAACTTGTAAGG 303

Qy        461 ATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATAT 520
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        304 ATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATAT 363

Qy        521 CTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTG 580
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 CTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTG 423

Qy        581 GATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTT 640
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        424 GATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTT 483

Qy        641 GCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGA 700
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 GCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGA 543

Qy        701 TTGTTGGACCTTCAGATGGAAGTAGTTACA-TTATAGATTACTATGGAACCAGACTTACA 759
              |||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
Db        544 TTGTTGGACCTTCAGATGGAAGTAGTTACATTTATAGATTACTATGGAACCAGACTTACA 603

Qy        760 AGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||||||||||||||||||||  |
Db        604 AGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCANTA 654



RESULT 7
BI546941
LOCUS       BI546941                 975 bp    mRNA    linear   EST 05-SEP-2001
DEFINITION  603190155F1 NIH_MGC_95 Homo sapiens cDNA clone IMAGE:5261748 5',
            mRNA sequence.
ACCESSION   BI546941
VERSION     BI546941.1  GI:15434253
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 975)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11659 row: m column: 13
            High quality sequence stop: 766.
FEATURES             Location/Qualifiers
     source          1..975
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5261748"
                     /tissue_type="hippocampus"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_95"
                     /note="Organ: brain; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.5 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this
                     is a NIH_MGC Library."
ORIGIN

Query Match 77.3%; Score 626.2; DB 12; Length 975;

Best Local Similarity 98.6%; Pred. No. 4.9e-162;

Matches 642; Conservative 0; Mismatches 8; Indels 1; Gaps 1;



Qy        160 AGTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCT 219
              || |  ||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 AGCGTGGAGTAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCT 60

Qy        220 CCGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGA 279
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CCGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGA 120

Qy        280 CCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTC 339
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTC 180

Qy        340 AAAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTT 399
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AAAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTT 240

Qy        400 AACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAG 459
              |||||||||||||| |||||||||||||||||||||||||||| ||||||||||||||||
Db        241 AACTGTACAAACTATACAGCTCATGTTTCCTGTTTTCCAGCACACAACATAACTTGTAAG 300

Qy        460 GATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATA 519
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 GATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATA 360

Qy        520 TCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTT 579
              ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db        361 TCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTAGCATTGTCTCTTTTTCTT 420

Qy        580 GGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTT 639
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTT 480

Qy        640 TGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAG 699
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAG 540

Qy        700 ATTGTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACA 759
              |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db        541 ATTGTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTAAGGAACCAGACTTACA 600

Qy        760 AGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||| ||||||||||||||||||||||||||
Db        601 AGACTGAGTATTACTAATGAAACA-TTAGAAAAACGCAATTATATCCATAA 650



RESULT 8
BI464436
LOCUS       BI464436                 882 bp    mRNA    linear   EST 21-AUG-2001
DEFINITION  603205310F1 NIH_MGC_97 Homo sapiens cDNA clone IMAGE:5271098 5',
            mRNA sequence.
ACCESSION   BI464436
VERSION     BI464436.1  GI:15255092
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 882)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11684 row: c column: 03
            High quality sequence stop: 759.
FEATURES             Location/Qualifiers
     source          1..882
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5271098"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_97"
                     /note="Organ: testis; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.2 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN

Query Match 76.8%; Score 622.4; DB 12; Length 882;

Best Local Similarity 99.5%; Pred. No. 5.4e-161;

Matches 645; Conservative 0; Mismatches 1; Indels 2; Gaps 2;



Qy        164 GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 223
              | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          4 GGGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 63

Qy        224 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 283
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         64 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 123

Qy        284 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 343
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 183

Qy        344 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 403
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 243

Qy        404 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 463
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 303

Qy        464 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 523
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        304 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 363

Qy        524 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 583
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 423

Qy        584 GGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCA 643
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        424 GGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCA 483

Qy        644 CTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 703
              |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 CTGTAGGG-TTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 542

Qy        704 TTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGAC 763
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        543 TTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGAC 602

Qy        764 TGAGTATTACTAATGAAACA-TTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||| |||||||||||||||||||||||||||
Db        603 TGAGTATTACTAATGAAACATTTTAGAAAAACGCAATTATATCCATAA 650



RESULT 9
BI462204
LOCUS       BI462204                 879 bp    mRNA    linear   EST 21-AUG-2001
DEFINITION  603205517F1 NIH_MGC_97 Homo sapiens cDNA clone IMAGE:5271077 5',
            mRNA sequence.
ACCESSION   BI462204
VERSION     BI462204.1  GI:15252860
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 879)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11684 row: b column: 06
            High quality sequence stop: 753.
FEATURES             Location/Qualifiers
     source          1..879
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5271077"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_97"
                     /note="Organ: testis; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.2 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



Query Match 76.7%; Score 621.6; DB 12; Length 879; 

Best Local Similarity 98.5%; Pred. No. 8.9e-161; 

Matches 638; Conservative 0; Mismatches 9; Indels 1; Gaps 1; 
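
The two percentages in this summary appear to follow from its other figures: Query Match is consistent with Score divided by the perfect score of 810 given in the search header, and Best Local Similarity is consistent with Matches divided by the number of aligned columns (Matches + Mismatches + Indels). The following is a minimal Python sketch of that arithmetic. The function names are hypothetical and are not part of the search program, and the formulas are inferred from the reported numbers rather than documented in this listing.

```python
PERFECT_SCORE = 810  # "Perfect score" reported in the search header


def query_match(score, perfect=PERFECT_SCORE):
    """Score expressed as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches, mismatches, indels):
    """Identical columns as a percentage of all aligned columns (assumed formula)."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


# Figures reported for RESULT 9
print(query_match(621.6))                # 76.7
print(best_local_similarity(638, 9, 1))  # 98.5
```

The same two functions reproduce the percentages printed for the other results in this part of the listing, which is why the formulas are offered here as a plausible reading.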

Qy        164 GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 223
              | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          4 GGGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 63

Qy        224 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 283
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         64 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 123

Qy        284 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 343
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 183

Qy        344 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 403
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACT 243

Qy        404 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 463
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 GTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATT 303

Qy        464 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 523
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        304 CCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTT 363

Qy        524 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 583
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 GCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGAT 423

Qy        584 GGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCA 643
              ||||||||||||||||||||||||||||||  ||    ||||||||||||||||||||||
Db        424 GGTTGGGAGCAGATCGATTTTACCTTGGATTACCCTGCTTGGGTTTGTTAAAGTTTTGCA 483

Qy        644 CTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 703
              |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 CTGTAGGGTTCTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTG 543

Qy        704 TTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGAC 763
              |||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
Db        544 TTGGACCTTCAGATGGAAGTAGTTACCTTATAGATTACTATGGAACCAGACTTACAAGAC 603

Qy        764 TGAGTATTACTAATGAAA-CATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||| |||||||||||||||||||||||||||||
Db        604 TGAGTATTACTAATGAAACCATTTAGAAAAACGCAATTATATCCATAA 651



RESULT 10
BQ639765
LOCUS       BQ639765                 615 bp    mRNA    linear   EST 15-JUL-2002
DEFINITION  he20a04.y1 Human Retina cDNA (Un-normalized, unamplified): hd/he
            Homo sapiens cDNA clone he20a04 5', mRNA sequence.
ACCESSION   BQ639765
VERSION     BQ639765.1  GI:21764224
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 615)
  AUTHORS   Wistow,G., Bernstein,S.L., Wyatt,M.K., Ray,S., Behal,A.,
            Touchman,J.W., Bouffard,G., Smith,D. and Peterson,K.
  TITLE     Expressed sequence tag analysis of human retina for the NEIBank
            Project: Retbindin, an abundant, novel retinal cDNA and alternative
            splicing of other retina-preferred gene transcripts
  JOURNAL   Mol. Vis. 8 (4), 196-204 (2002)
  MEDLINE   22103461
   PUBMED   12107411
COMMENT     Contact: Wistow G
            Section on Molecular Structure and Function
            National Eye Institute
            6/331, NIH, Bethesda, MD 20892-2740, USA
            Tel: 301 402 3452
            Fax: 301 496 0078
            Email: graeme@helix.nih.gov
            Plate: 20 row: a column: 04
            Seq primer: M13RP1 reverse primer (ABI).
FEATURES             Location/Qualifiers
     source          1..615
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="he20a04"
                     /tissue_type="Retina"
                     /dev_stage="Adult"
                     /lab_host="EMDH10B"
                     /clone_lib="Human Retina cDNA (Un-normalized,
                     unamplified): hd/he"
                     /note="Organ: Eye; Vector: pSPORT1; Neural retina tissue
                     was dissected from two 80 year old donors with no observed
                     eye disease. 100ug of total RNA was used for library
                     construction. A directionally cloned cDNA library in the
                     pSPORT1 vector (Life Technologies) was constructed at
                     Bioserve Biotechnology (Laurel MD) essentially following
                     the protocols of the Superscript Plasmid System full
                     details of which are contained in the manufacturer's
                     Instruction manual (http://www.lifetech.com/). First
                     strand synthesis was carried out using a Not I
                     primer-adapter
                     [5'-pGACTAGTTCTAGATCGCGAGCGGCCGCCC(T)15-3']. EST analysis
                     was performed on the unamplified library at the NIH
                     Intramural Sequencing Center (NISC)."
ORIGIN

Query Match 75.7%; Score 613.4; DB 13; Length 615;

Best Local Similarity 99.8%; Pred. No. 1.5e-158;

Matches 614; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy        193 GCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGTGTC 252
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGTGTC 60

Qy        253 CTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGC 312
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGC 120

Qy        313 GAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAAA 372
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAAA 180

Qy        373 ATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGT 432
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 ATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGT 240

Qy        433 TTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGG 492
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGG 300

Qy        493 AACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAA 552
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAA 360

Qy        553 GTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGA 612
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 GTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGA 420

Qy        613 TACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTA 672
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTA 480

Qy        673 ATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATT 732
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 ATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATT 540

Qy        733 ATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAA 792
              |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db        541 ATAGATTACTATGGGACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAA 600

Qy        793 ACGCAATTATATCCA 807
              |||||||||||||||
Db        601 ACGCAATTATATCCA 615



RESULT 11
CB310671
LOCUS       CB310671                 772 bp    mRNA    linear   EST 04-MAR-2003
DEFINITION  AGENCOURT_11828318 NICHD_Rh_Ov1 Macaca mulatta cDNA clone
            IMAGE:6895132 5', mRNA sequence.
ACCESSION   CB310671
VERSION     CB310671.1  GI:28833385
KEYWORDS    EST.
SOURCE      Macaca mulatta (rhesus monkey)
  ORGANISM  Macaca mulatta
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae;
            Cercopithecinae; Macaca.
REFERENCE   1  (bases 1 to 772)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Eliot Spindel
            cDNA Library Preparation: CLONTECH
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLCM3162 row: c column: 03
            High quality sequence stop: 646.
FEATURES             Location/Qualifiers
     source          1..772
                     /organism="Macaca mulatta"
                     /mol_type="mRNA"
                     /db_xref="taxon:9544"
                     /clone="IMAGE:6895132"
                     /tissue_type="Ovary"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NICHD_Rh_Ov1"
                     /note="Organ: ovary; Vector: pDNR-LIB; Site_1: Sfi I;
                     Site_2: Sfi I; Cloned unidirectionally. Primer: Oligo dT.
                     Average insert size 1.0-4.0 kb. Tissue pooled from
                     pre-pubertal, post pubertal and menopausal monkeys.
                     Constructed by Clontech. Note: this is a NICHD Library."
ORIGIN

Query Match 74.9%; Score 606.4; DB 14; Length 772;

Best Local Similarity 96.0%; Pred. No. 1.4e-156;

Matches 622; Conservative 0; Mismatches 26; Indels 0; Gaps 0;



Qy        163 GGCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCG 222
              ||   | |||||||||||||||||||||||||||| |||||||| ||| |||||||||||
Db          1 GGGAGGGAAGTGTCGGTCTCCAAGATGGCGGCCGCGTGGCCGTCAGGTTCGTCTGCTCCG 60

Qy        223 GAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCC 282
              ||||||| ||| || |||||| | ||||||||||||||||||||||||||||||||||||
Db         61 GAGGCCGCGACTGCTAGACTCCTCGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCC 120

Qy        283 TGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAA 342
              |||||||||||||||||||| |||||||||||||||| || |||||||||||||||||||
Db        121 TGGGGGGCTGTTGCCACCTCTGCCGGGGGCGAGGAGTTGCCTAAGTGCGAGGACCTCAAA 180

Qy        343 GTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAAC 402
              |||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
Db        181 GTGGGACAATATATTTGTAAAGATCCAAAAATAAATGATGCTACGCAAGAACCAGTTAAC 240

Qy        403 TGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGAT 462
              ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db        241 TGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCTAACATAACTTGTAAGGAT 300

Qy        463 TCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCT 522
              |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db        301 TCCAGTGGCAATGAAACACATTTTACTGGGAATGAAGTTGGTTTTTTCAAGCCCATATCT 360

Qy        523 TGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGA 582
              ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db        361 TGCCGAAATGTAAATGGCTATTCATACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGA 420

Qy        583 TGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGC 642
              |||||||||||||||||||||||||| ||||||||||| |||||||||||||||||||||
Db        421 TGGTTGGGAGCAGATCGATTTTACCTGGGATACCCTGCCTTGGGTTTGTTAAAGTTTTGC 480

Qy        643 ACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATT 702
              |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 ACTGTAGGATTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATT 540

Qy        703 GTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGA 762
              ||||||||||||||||||||||||||||| ||||||||||||||||||||||| ||||||
Db        541 GTTGGACCTTCAGATGGAAGTAGTTACATCATAGATTACTATGGAACCAGACTGACAAGA 600

Qy        763 CTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              || ||||||||||||||||||| |||||||||||||||||||||||||
Db        601 CTAAGTATTACTAATGAAACATATAGAAAAACGCAATTATATCCATAA 648
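
The marker row printed between each Qy and Db row carries a bar wherever the two aligned bases are identical and a blank elsewhere. As a hedged illustration, the Python sketch below rebuilds such a row from a pair of aligned sequences, using the final row pair of the RESULT 11 alignment with the extraction spaces removed. The helper name `match_line` is hypothetical, and treating a gap character as a non-match is an assumption about how the original program draws the row.

```python
def match_line(query_row, db_row):
    """Return '|' where two aligned rows agree and ' ' elsewhere."""
    if len(query_row) != len(db_row):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(
        "|" if q == d and q != "-" else " "   # a gap column never counts as a match
        for q, d in zip(query_row.upper(), db_row.upper())
    )


# Last row pair of RESULT 11 (Qy 763-810 against Db 601-648)
qy = "CTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA"
db = "CTAAGTATTACTAATGAAACATATAGAAAAACGCAATTATATCCATAA"

line = match_line(qy, db)
print(qy)
print(line)
print(db)
print(line.count("|"), "identical columns out of", len(line))
```

Summing the bar counts over every row of an alignment should reproduce the Matches figure in that result's summary, which gives a quick consistency check on a transcribed alignment.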



RESULT 12
BI562596
LOCUS       BI562596                 950 bp    mRNA    linear   EST 05-SEP-2001
DEFINITION  603256530F1 NIH_MGC_97 Homo sapiens cDNA clone IMAGE:5298943 5',
            mRNA sequence.
ACCESSION   BI562596
VERSION     BI562596.1  GI:15449910
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 950)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11756 row: k column: 08
            High quality sequence stop: 753.
FEATURES             Location/Qualifiers
     source          1..950
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5298943"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_97"
                     /note="Organ: testis; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.2 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



Query Match 74.8%; Score 606; DB 12; Length 950; 

Best Local Similarity 98.6%; Pred. No. 1.9e-156; 

Matches 643; Conservative 0; Mismatches 5; Indels 4; Gaps 3; 
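
The Score in each summary can be related to the counts on the Matches line. The search header gives the gap penalties (Gapop 10.0, Gapext 1.0) and names an identity scoring table. The scores in this part of the listing are consistent with +1 per match, a gap-open charge per gap, and a gap-extend charge per gapped position. The mismatch value of -0.6 used below is not stated anywhere in the listing; it is an assumption inferred from the reported numbers. A minimal Python sketch with hypothetical names:

```python
GAP_OPEN = 10.0    # "Gapop" from the search header
GAP_EXTEND = 1.0   # "Gapext" from the search header
MATCH = 1.0        # assumed match value of the identity table
MISMATCH = -0.6    # assumption inferred from the reported scores, not stated in the listing


def alignment_score(matches, mismatches, indels, gaps):
    """Rebuild a reported Score from the counts on the Matches line."""
    raw = (matches * MATCH
           + mismatches * MISMATCH
           - gaps * GAP_OPEN        # one opening charge per gap
           - indels * GAP_EXTEND)   # one extension charge per gapped position
    return round(raw, 1)


# Counts reported for RESULT 12: Score 606
print(alignment_score(643, 5, 4, 3))  # 606.0
```

Under these assumptions the same function also reproduces the scores printed for RESULT 9, RESULT 10 and RESULT 11, so it can serve as a sanity check when a summary line has been damaged in transcription.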

Qy        161 GTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTC 220
              | |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          4 GGGGCGTGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTC 63

Qy        221 CGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGAC 280
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         64 CGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGAC 123

Qy        281 CCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCA 340
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 CCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCA 183

Qy        341 AAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTA 400
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 AAGTGGGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTA 243

Qy        401 ACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGG 460
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 ACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGG 303

Qy        461 ATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATAT 520
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        304 ATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATAT 363

Qy        521 CTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTG 580
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 CTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTG 423

Qy        581 GATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTT 640
              |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
Db        424 GATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGC-TTGGGTTTGTTAAAGTTTT 482

Qy        641 GCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCA-- 698
              ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db        483 GCACTGTAGGG-TTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAAG 541

Qy        699 GATTGTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 758
                | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        542 ATTGGTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 601

Qy        759 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        602 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 653





RESULT 13
BI596830
LOCUS       BI596830                 901 bp    mRNA    linear   EST 07-SEP-2001
DEFINITION  603243323F1 NIH_MGC_96 Homo sapiens cDNA clone IMAGE:5285933 5',
            mRNA sequence.
ACCESSION   BI596830
VERSION     BI596830.1  GI:15489769
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 901)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11722 row: m column: 06
            High quality sequence stop: 760.
FEATURES             Location/Qualifiers
     source          1..901
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5285933"
                     /tissue_type="hypothalamus"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_96"
                     /note="Organ: brain; Vector: pBluescriptR (modified
                     pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
                     (gtcgag); Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.3 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: this is
                     a NIH_MGC Library."
ORIGIN



Query Match 74.7%; Score 605.4; DB 12; Length 901; 

Best Local Similarity 98.7%; Pred. No. 2.8e-156; 

Matches 631; Conservative 0; Mismatches 6; Indels 2; Gaps 2; 

Qy 172 GTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 231 

I I I I I I I I I I I I I I I I I II I I III I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 4 GGGTCGGTCTCCAAGATGGCGGACGCTTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 63 

Qy 232 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 291 

I I II I I I I I I II I I II I I I I M II I I I I I I M I I I I M II II I I I I I I I I I I M I I I I I 
Db 64 ACGG-CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 122 

Qy 292 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 351 

I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I II I I M I I I 
Db 123 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 182 

Qy 352 TATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 411

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 183 TATATTTGTCAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 242

Qy 412 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGC 471

I I I I I I I II I I I I I I I II I I I I II I II I II I I I M I I I I I M I M II I I I I I I I I I I I II 
Db 243 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGC 302

Qy 472 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 531

I I I I II II I II I I I I I II I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I 1 I I II I I 
Db 303 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 362 

Qy 532 GTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 591 

I I I II I II I II I I I I I I II I I I I I II I I I M I I I I I I I I I I I I I I I I II I I I I I II I I 
Db 363 GGT7^TGGCTATTCCTACA7^AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 422 

Qy 592 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 651

M I I M M I I M M M M I II II I M I I I I II I I M M II I II I II I I I I I I I I I II I I I 

Db 423 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 482 

Qy 652 TTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 711 

II M M I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I 
Db 483 -TTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 541 

Qy 712 TCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATT 771 

I I I I I I I II I I I I I I II I I I II I II I I I I I I I II I I I I II I II I I I I I I I I M I I I I II I 
"Db 54 2 — T CAGAT GGAAGT AGT T ACAT T AT AGATT ACTAT GGAAC GAGAGT-T-ACAAGACTGAGTATT_6.0.1_ 



Qy 772 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810

I I I I II I I I I I II I I I I I I I I I I I I I I I II II I I I I I I I 
Db 602 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 640
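
The statistics printed ahead of each alignment can be cross-checked with simple arithmetic. The sketch below assumes, from the printed numbers only and not from any GenCore documentation, that Query Match is the score divided by the perfect score (810 for this query) and that Best Local Similarity is the match count divided by the number of aligned columns (matches + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


# Values printed for RESULT 13: Score 605.4, Matches 631, Mismatches 6, Indels 2.
print(round(query_match_pct(605.4, 810), 1))            # 74.7
print(round(best_local_similarity_pct(631, 6, 2), 1))   # 98.7
```

Under these assumptions the computed values agree with the 74.7% and 98.7% reported for RESULT 13.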



RESULT 14 
BI596662 

LOCUS BI596662 908 bp mRNA linear EST 07-SEP-2001
DEFINITION 603243232F1 NIH_MGC_96 Homo sapiens cDNA clone IMAGE:5285982 5',
mRNA sequence.
ACCESSION BI596662
VERSION BI596662.1 GI:15489601
KEYWORDS EST.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 908)
AUTHORS NIH-MGC http://mgc.nci.nih.gov/.
TITLE National Institutes of Health, Mammalian Gene Collection (MGC)
JOURNAL Unpublished (1999)
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
cDNA Library Preparation: Michael J. Brownstein (NHGRI), Shiraki
Toshiyuki and Piero Carninci (RIKEN)
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Incyte Genomics, Inc.
Clone distribution: MGC clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov
Plate: LLAM11722 row: o column: 07
High quality sequence stop: 755.
FEATURES Location/Qualifiers
source 1..908
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="IMAGE:5285982"
/tissue_type="hypothalamus"
/lab_host="DH10B"
/clone_lib="NIH_MGC_96"
/note="Organ: brain; Vector: pBluescriptR (modified
pBluescript KS+); Site_1: BamHI; Site_2: SalI-XhoI
(gtcgag); Oligo-dT primed using primer
5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
insert size 2.3 kb and normalized to ROT 5. This is a
primary library enriched for full-length clones and
constructed using the Cap-trapper method (Carninci, in
preparation). Library constructed by M. Brownstein
(NIMH/NHGRI, National Institutes of Health). Note: this is
a NIH_MGC Library."
ORIGIN

Query Match 74.6%; Score 604.4; DB 12; Length 908;
Best Local Similarity 98.6%; Pred. No. 5.3e-156;
Matches 631; Conservative 0; Mismatches 6; Indels 3; Gaps 2;

Qy  172 GTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 231
        | |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
Db    4 GGGTCGGTCTCCAAGATGGCGGCCGCTTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 63

Qy  232 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 291
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   64 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 123

Qy  292 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 351
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  124 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAA 183

Qy  352 TATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 411
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  184 TATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAAC 243

Qy  412 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTT-GTAAGGATTCCAGTGG 470
        ||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||
Db  244 TACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGGTAAGGATTCCAGTGG 303

Qy  471 CAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAA 530
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  304 CAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAA 363

Qy  531 TGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGG 590
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  364 TGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGG 423

Qy  591 AGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGG 650
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  424 AGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGG 483

Qy  651 GTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACC 710
        ||||    | | ||||||||||||||||||||||||||||||||||||||||||||||||
Db  484 GTTT--GTGGAATGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACC 541

Qy  711 TTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTAT 770
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  542 TTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTAT 601

Qy  771 TACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
        ||||||||||||||||||||||||||||||||||||||||
Db  602 TACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 641
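
The report header names a Smith-Waterman ("sw") model with the IDENTITY_NUC scoring table and Gapop 10.0, Gapext 1.0. The following is a minimal sketch of a local-alignment score with affine gaps (Gotoh's formulation), not the GenCore implementation. Three values are assumptions: a match scores +1 (consistent with a perfect score of 810 for an 810-base query), a mismatch scores -0.6, and a gap of length k costs Gapop + k * Gapext. The mismatch value and gap convention were chosen only because, with them, the printed tallies for RESULT 14 (631 matches, 6 mismatches, one gap of length 1 and one of length 2) add up to the printed score of 604.4. The names `sw_affine_score` and `tally_score` are hypothetical.

```python
def sw_affine_score(query, subject, match=1.0, mismatch=-0.6,
                    gap_open=10.0, gap_ext=1.0):
    """Best local alignment score, Smith-Waterman with affine gaps."""
    neg_inf = float("-inf")
    n = len(subject)
    h_prev = [0.0] * (n + 1)   # best scores for the previous query row
    f = [neg_inf] * (n + 1)    # best score ending in a gap that skips query bases
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        e = neg_inf            # best score ending in a gap that skips subject bases
        for j in range(1, n + 1):
            # Either extend an open gap or open a new one (open + first extension).
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f[j] = max(f[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            s = match if query[i - 1] == subject[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f[j])
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best


def tally_score(matches, mismatches, gap_lengths, match=1.0, mismatch=-0.6,
                gap_open=10.0, gap_ext=1.0):
    """Score implied by alignment counts under the same assumed parameters."""
    gap_cost = sum(gap_open + gap_ext * k for k in gap_lengths)
    return matches * match + mismatches * mismatch - gap_cost


print(round(tally_score(631, 6, [1, 2]), 1))   # RESULT 14 counts
```

The agreement between the tally and the reported score is suggestive only; how the actual scoring table treats ambiguity codes such as N is not shown in this report.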



RESULT 15
AI923178
LOCUS AI923178 599 bp mRNA linear EST 02-SEP-1999
DEFINITION wn67b10.x1 NCI_CGAP_Lu19 Homo sapiens cDNA clone IMAGE:2450491 3'
similar to WP:C02F5.3 CE00039 GTP-BINDING PROTEIN ;, mRNA sequence.
ACCESSION AI923178
VERSION AI923178.1 GI:5659142
KEYWORDS EST.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 599)
AUTHORS NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
TITLE National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
Tumor Gene Index
JOURNAL Unpublished (1997)
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Christopher Moskaluk, M.D., Ph.D., Michael R.
Emmert-Buck, M.D., Ph.D.
cDNA Library Preparation: M. Bento Soares, Ph.D.
cDNA Library Arrayed by: Greg Lennon, Ph.D.
DNA Sequencing by: Washington University Genome Sequencing Center
Clone distribution: NCI-CGAP clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
www-bio.llnl.gov/bbrp/image/image.html
Seq primer: -40UP from Gibco
High quality sequence stop: 457.
FEATURES Location/Qualifiers
source 1..599
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="IMAGE:2450491"
/tissue_type="squamous cell carcinoma, poorly
differentiated (4 pooled tumors, including primary and
metastatic)"
/dev_stage="adult"
/lab_host="DH10B (phage-resistant)"
/clone_lib="NCI_CGAP_Lu19"
/note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
modified polylinker; 1st strand cDNA was prepared from
pooled lung tumor tissue, and was then primed with a Not I
- oligo(dT) primer. Double-stranded cDNA was ligated to
Eco RI adaptors (Pharmacia), digested with Not I and
cloned into the Not I and Eco RI sites of the modified
pT7T3 vector. Library went through one round of
normalization. Library constructed by Bento Soares and M.
Fatima Bonaldo."
ORIGIN



Query Match 73.5%; Score 595.4; DB 9; Length 599;
Best Local Similarity 99.5%; Pred. No. 1.4e-153;
Matches 596; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  190 GCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGT 249
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTCGTTGGT 60

Qy  250 GTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGG 309
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACCTCCGCCGGG 120

Qy  310 GGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCA 369
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCA 180

Qy  370 AAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCC 429
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCC 240

Qy  430 TGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACT 489
        ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db  241 TGTTTTCCAGCACCCCACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACT 300

Qy  490 GGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTAC 549
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTAC 360

Qy  550 AAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTT 609
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 AAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTT 420

Qy  610 GGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGC 669
        ||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Db  421 GGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTANGGTTTTGTGGAATTGGGAGC 480

Qy  670 CTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTAC 729
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTAC 540

Qy  730 ATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAG 788
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db  541 ATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATNTAG 599



Search completed: March 4, 2004, 09:16:38 
Job time : 2551 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: March 4, 2004, 05:35:01 ; Search time 3285 Seconds 

(without alignments) 
10687.323 Million cell updates/sec 



Title: US-09-852-100B-1
Perfect score: 810
Sequence: 1 atgcatattttaaaagggtc aaacgcaattatatccataa 810

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 3470272 seqs, 21671516995 residues

Total number of hits satisfying chosen parameters: 6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : GenEmbl:*



1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
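
The throughput figure in the run header can be reproduced from the other header values. The sketch below assumes that a "cell update" is one query-base by database-residue comparison and that both strands of every database sequence are searched. The factor of two is an inference from the hit count (6940544 is exactly twice the 3470272 sequences searched), not something the report states. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float, strands: int = 2) -> float:
    """Dynamic-programming cells filled per second, in millions."""
    return query_len * db_residues * strands / seconds / 1e6


# GenEmbl run: 810-base query, 21671516995 residues, 3285 seconds.
print(round(million_cell_updates_per_sec(810, 21671516995, 3285), 3))  # 10687.323
```

The same formula applied to the EST run at the top of the report (14931090276 residues in 2545 seconds) gives 9504.270, which also matches the printed value.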

SUMMARIES
                 %
Result         Query
  No.   Score  Match  Length  DB  ID        Description
   1     810  100.0     810   6  BD243134  BD243134 6-Protein
   2     810  100.0    1246   9  AF353990  AF353990 Homo sapi
   3     645   79.6     984   9  BC029486  BC029486 Homo sapi
   4   602.8   74.4     970   6  BD139411  BD139411 Extended
   5     499   61.6     508   6  BD265227  BD265227 Compounds
   6     499   61.6     508   6  BD265239  BD265239 Compounds
   7     499   61.6     508   6  AR401213  AR401213 Sequence
   8     499   61.6     508   6  AR401225  AR401225 Sequence
   9     499   61.6     508   6  AX192666  AX192666 Sequence
  10     499   61.6     508   6  AX192678  AX192678 Sequence
  11   471.6   58.2     630  10  AF353993  AF353993 Mus muscu
  12   436.8   53.9     440   6  BD076181  BD076181 5' EST of
c 13   430.6   53.2  193660   2  AC102262  AC102262 Mus muscu
  14   425.2   52.5     455   6  BD076249  BD076249 5' EST of
  15   411.4   50.8     487   6  AX892343  AX892343 Sequence
  16   411.4   50.8     487   6  BD027876  BD027876 Sequence
c 17     354   43.7  183425   9  AC097064  AC097064 Homo sapi
c 18     354   43.7  239704   9  AC099791  AC099791 Homo sapi
c 19   325.4   40.2  228458   2  AC097670  AC097670 Rattus no
  20     299   36.9  176056  10  AC073437  AC073437 Mus muscu
  21     299   36.9  196421  10  AL672100  AL672100 Mouse DNA
  22     276   34.1  185576   2  AC025691  AC025691 Homo sapi
c 23   250.4   30.9  231150   2  AC114195  AC114195 Rattus no
  24   244.4   30.2  157999   2  AC117088  AC117088 Rattus no
  25   209.8   25.9  277191   2  AC109077  AC109077 Rattus no
  26   186.2   23.0  167627   9  AC079382  AC079382 Homo sapi
  27   145.6   18.0  178068   2  AC142046  AC142046 Rattus no
c 28     114   14.1  129705   2  AC133258  AC133258 Rattus no
  29     114   14.1  239113   2  AC094034  AC094034 Rattus no
  30     114   14.1  324462   2  AC137263  AC137263 Rattus no
c 31   107.6   13.3  145871   2  AC143611  AC143611 Macaca mu
  32    99.8   12.3     298   6  E25986    E25986 Blastocyst
  33    99.2   12.2  240950   2  AC098287  AC098287 Rattus no



c 34    97.8   12.1  1312 lo  10  AL671140  AL671140 Mouse DNA
c 35    73.2    9.0  198396   2  AC101700  AC101700 Mus muscu
  36    64.6    8.0  128444   2  AC019924  AC019924 Drosophil
  37    64.6    8.0  1326 J /  o  al uuouyz  at nn^092 Drosophil
c 38    64.6    8.0  149592      AC005718  AC005718 Drosophil
c 39    64.6    8.0  179139   Q  a r ft Q Q "5 d n  Drosophil
c 40    64.6    8.0           3  AC007175  AC007175 Drosophil


c 41    64.6    8.0  305150   3  AE003453  AE003453 Drosophil
  42    64.4    8.0    1052   3  AY061343  AY061343 Drosophil
  43    58.2    7.2  155623   5  AL929239  AL929239 Zebrafish
  44    58.2    7.2  239459   2  BX322568  BX322568 Danio rer
c 45    54.6    6.7  214455   2  AC118451  AC118451 Rattus no



ALIGNMENTS 



RESULT 1
BD243134
LOCUS BD243134 810 bp DNA linear PAT 17-JUL-2003
DEFINITION 6-Protein-bound receptor-like protein, polynucleotide encoded by
it, and method of use thereof.
ACCESSION BD243134
VERSION BD243134.1 GI:33052904
KEYWORDS JP 2002527064-A/1.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 810)
AUTHORS Ozenberger,B.A., Kajkowski,E.M. and Lo,C.H.F.
TITLE 6-Protein-bound receptor-like protein, polynucleotide encoded by
it, and method of use thereof
JOURNAL Patent: JP 2002527064-A 1 27-AUG-2002;
AMERICAN HOME PRODUCTS CORP
COMMENT OS Homo sapiens (human)
PN JP 2002527064-A/1
PD 27-AUG-2002
PF 13-OCT-1999 JP 2000576015
PR 13-OCT-1998 US 60/104104
PI BRADLEY ALTON OZENBERGER, EILEEN MARIE KAJKOWSKI, CHING HSIUNG
PI FREDERICK LO
PC C12N15/09,A61K45/00,A61P43/00,C07K14/705,C12N1/15,C12N1/19,
PC C12N1/21,
PC C12N5/10,C12Q1/02,C12Q1/68,G01N33/15,G01N33/50,G01N33/53,
PC G01N33/566,
PC C12N15/00,C12N5/00
CC 6-Protein-bound receptor-like protein, polynucleotide encoded
CC by it, and
CC method of use thereof
FH Key Location/Qualifiers
FT source 1..810
FT /organism='Homo sapiens (human)'.
FEATURES Location/Qualifiers
source 1..810
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
ORIGIN



Query Match 100.0%; Score 810; DB 6; Length 810; 

Best Local Similarity 100.0%; Pred. No. 2.2e-206; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60

Qy   61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120

Qy  121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

Qy  181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

Qy  241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300

Qy  301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360

Qy  361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420

Qy  421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy  481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy  541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600

Qy  601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy  661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720

Qy  721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780

Qy  781 ACATTTAGAAAAACGCAATTATATCCATAA 810
        ||||||||||||||||||||||||||||||
Db  781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 2
AF353990
LOCUS AF353990 1246 bp mRNA linear PRI 29-MAY-2001
DEFINITION Homo sapiens beta-amyloid binding protein precursor (BBP) mRNA,
complete cds.
ACCESSION AF353990
VERSION AF353990.1 GI:13625458
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1246)
AUTHORS Kajkowski,E.M., Lo,C.F., Ning,X., Walker,S., Sofia,H.J., Wang,W.,
Edris,W., Chanda,P., Wagner,E., Vile,S., Ryan,K.,
McHendry-Rinde,B., Smith,S.C., Wood,A., Rhodes,K.J., Kennedy,J.D.,
Bard,J., Jacobsen,J.S. and Ozenberger,B.A.
TITLE beta-Amyloid peptide-induced apoptosis regulated by a novel
protein containing a g protein activation module
JOURNAL J. Biol. Chem. 276 (22), 18748-18756 (2001)
MEDLINE 21276355
PUBMED 11278849
REFERENCE 2 (bases 1 to 1246)
AUTHORS Ozenberger,B.A., Kajkowski,E., Jacobsen,J.S., Bard,J. and Walker,S.
TITLE Direct Submission
JOURNAL Submitted (27-FEB-2001) Wyeth Neuroscience, CN 8000, Princeton, NJ
08543, USA
FEATURES Location/Qualifiers
source 1..1246
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="1"
gene 1..1246
/gene="BBP"
CDS 304..927
/gene="BBP"
/note="membrane-associated glycoprotein"
/codon_start=1
/product="beta-amyloid binding protein precursor"
/protein_id="AAK35064.1"
/db_xref="GI:13625459"
/translation="MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEE
SLKCEDLKVGQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTG
NEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIG
SLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFRKTQLYP"
sig_peptide 304..414
/gene="BBP"
misc_feature 646..711
/gene="BBP"
/note="Region: transmembrane domain"
misc_feature 751..825
/gene="BBP"
/note="Region: transmembrane domain"
ORIGIN

Query Match 100.0%; Score 810; DB 9; Length 1246; 

Best Local Similarity 100.0%; Pred. No. 2.4e-206; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  118 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 177

Qy   61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  178 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 237

Qy  121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  238 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 297

Qy  181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  298 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 357

Qy  241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  358 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 417

Qy  301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  418 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 477

Qy  361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  478 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 537

Qy  421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  538 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 597

Qy  481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  598 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 657

Qy  541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  658 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 717

Qy  601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  718 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 777

Qy  661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  778 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 837

Qy  721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  838 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 897

Qy  781 ACATTTAGAAAAACGCAATTATATCCATAA 810
        ||||||||||||||||||||||||||||||
Db  898 ACATTTAGAAAAACGCAATTATATCCATAA 927
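
Because the RESULT 2 alignment is ungapped (Indels 0) and starts at Db position 118, feature coordinates in AF353990 can be moved onto the query by a constant shift. The sketch below illustrates that bookkeeping for the features listed in the record; the constant and function names are hypothetical, and the shift is valid only for database positions inside the aligned range 118..927.

```python
QUERY_START_IN_DB = 118  # Db coordinate aligned to query base 1 in RESULT 2


def db_to_query(db_pos: int) -> int:
    """Map an AF353990 coordinate to a query coordinate (ungapped alignment)."""
    return db_pos - QUERY_START_IN_DB + 1


# Feature locations as printed in the AF353990 record.
features = {
    "CDS": (304, 927),
    "sig_peptide": (304, 414),
    "transmembrane domain 1": (646, 711),
    "transmembrane domain 2": (751, 825),
}

for name, (start, end) in features.items():
    print(name, db_to_query(start), db_to_query(end))

cds_length = 927 - 304 + 1
print(cds_length, cds_length // 3)  # bases, codons (stop codon included)
```

On this mapping the CDS corresponds to query bases 187..810, so the coding region ends exactly at the last base of the query, where both alignment rows show the stop codon TAA.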



RESULT 3
BC029486
LOCUS       BC029486     984 bp    mRNA    linear   PRI 06-OCT-2003
DEFINITION  Homo sapiens beta-amyloid binding protein precursor, mRNA (cDNA
            clone MGC:32941 IMAGE:5271098), complete cds.
ACCESSION   BC029486
VERSION     BC029486.1  GI:20809565
KEYWORDS    MGC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 984)
  AUTHORS   Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
            Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
            Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
            Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
            Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
            Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
            Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
            Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
            Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
            McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
            Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
            Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
            Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
            Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
            Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
            Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
            Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
            Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
  TITLE     Generation and initial analysis of more than 15,000 full-length
            human and mouse cDNA sequences
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
  MEDLINE   22388257
   PUBMED   12477932
REFERENCE   2  (bases 1 to 984)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-MAY-2002) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Miklos Palkovits, M.D., Ph.D.
            cDNA Library Preparation: Michael J. Brownstein (NHGRI) & Shiraki
            Toshiyuki and Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Sequencing Group at the Stanford Human Genome
            Center, Stanford University School of Medicine, Stanford, CA 94305
            Web site: http://www-shgc.stanford.edu
            Contact: (Dickson, Mark) mcd@paxil.stanford.edu
            Dickson, M., Schmutz, J., Grimwood, J., Rodriquez, A., and Myers,
            R. M.
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 48 Row: b Column: 24
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 17738309.
FEATURES             Location/Qualifiers
     source          1..984
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="MGC:32941 IMAGE:5271098"
                     /tissue_type="Testis"
                     /clone_lib="NIH_MGC_97"
                     /lab_host="DH10B"
                     /note="Vector: pBluescript"
     gene            1..984
                     /gene="BBP"
                     /db_xref="LocusID:83941"
     CDS             22..645
                     /codon_start=1
                     /product="beta-amyloid binding protein precursor"
                     /protein_id="AAH29486.1"
                     /db_xref="GI:20809566"
                     /db_xref="LocusID:83941"
                     /translation="MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEE
                     SLKCEDLKVGQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTG
                     NEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIG
                     SLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFRKTQLYP"
     misc_feature    385..525
                     /note="XynA; Region: Predicted membrane protein [Function
                     unknown]"
                     /db_xref="CDD:COG2314"
ORIGIN

  Query Match             79.6%;  Score 645;  DB 9;  Length 984;
  Best Local Similarity   100.0%;  Pred. No. 4.8e-162;
  Matches  645;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy        166 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAG 225
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAG 60

Qy        226 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 285
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 120

Qy        286 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 345
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 180

Qy        346 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 405
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 240

Qy        406 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 465
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 300

Qy        466 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 525
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 360

Qy        526 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 585
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 420

Qy        586 TTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACT 645
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACT 480

Qy        646 GTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTT 705
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 GTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTT 540

Qy        706 GGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTG 765
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GGACCTTCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTG 600

Qy        766 AGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              |||||||||||||||||||||||||||||||||||||||||||||
Db        601 AGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 645



RESULT 4
BD139411
LOCUS       BD139411     970 bp    DNA     linear   PAT 18-SEP-2002
DEFINITION  Extended cDNA of secretory protein.
ACCESSION   BD139411
VERSION     BD139411.1  GI:23234356
KEYWORDS    JP 2002508182-A/163.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 970)
  AUTHORS   Bougueleret,L., Duclert,A. and Edwards,J.B.D.M.
  TITLE     Extended cDNA of secretory protein
  JOURNAL   Patent: JP 2002508182-A 163 19-MAR-2002;
            GENSET
COMMENT     OS   Homo sapiens (human)
            PN   JP 2002508182-A/163
            PD   19-MAR-2002
            PF   17-DEC-1998 JP 2000539136
            PR   17-DEC-1997 US 60/069957, 09-FEB-1998 US 60/074121
            PR   13-APR-1998 US 60/081563, 10-AUG-1998 US 60/096116
            PI   LYDIE BOUGUELERET, AYMERIC DUCLERT, JEAN BAPTISTE DUMAS MILNE
            PI   EDWARDS
            PC   C12N15/09,C12N15/09,C07K14/47,C07K16/18,C12N1/15,C12N1/19,
            PC   C12N1/21,
            PC   C12N5/10,C12P21/02,C12Q1/68,C12N15/00,C12N5/00,C12N15/00
            CC   Von Heijne matrix
            CC   score 5.5
            CC   seq LVGVLWFVSVTTG/PW
            FH   Key             Location/Qualifiers
            FT   CDS             12..497
            FT   sig_peptide     12..104
            FT   polyA_signal    935..940
            FT   polyA_site      955..967.
FEATURES             Location/Qualifiers
     source          1..970
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN

Query Match 74.4%; Score 602.8; DB 6; Length 970; 

Best Local Similarity 98.4%; Pred. No. 1e-150;

Matches 624; Conservative 5; Mismatches 3; Indels 2; Gaps 2; 

Qy        177 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGC 236
              |||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
Db          2 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGKCTGCTCCGGAGGCCGTGACGGC 61

Qy        237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         62 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 121

Qy        297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 356
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        122 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 181

Qy        357 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 416
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        182 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 241

Qy        417 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 476
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        242 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 301

Qy        477 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 536
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        302 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 361

Qy        537 TGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 596
              ||||||||||||||| |  ||||||||| |||||||||||||||||||||||||||||||
Db        362 TGGCTATTCCTACAATG-AGCAGTCGCA-TGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 419

Qy        597 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTG 656
              |||||||||||||||||||||||||||||||||||||||:|||:||||||||||||||:|
Db        420 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAABTTTYGCACTGTAGGGTTTKG 479

Qy        657 TGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGA 716
              |||||||||||||||||||||||||||:|||||||||||||||||||||||||||||| |
Db        480 TGGAATTGGGAGCCTAATTGATTTCATYCTTATTTCAATGCAGATTGTTGGACCTTCAAA 539

Qy        717 TGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAA 776
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        540 TGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAA 599

Qy        777 TGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||||||
Db        600 TGAAACATTTAGAAAAACGCAATTATATCCATAA 633



RESULT 5
BD265227
LOCUS       BD265227     508 bp    DNA     linear   PAT 17-JUL-2003
DEFINITION  Compounds for immunotherapy and diagnosis of colonic cancer and
            method of using the same.
ACCESSION   BD265227
VERSION     BD265227.1  GI:33074995
KEYWORDS    JP 2002533082-A/225.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 508)
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Benson,D.R., Meagher,M.J., Stolk,J.,
            Wang,T. and Yuqiu,J.
  TITLE     Compounds for immunotherapy and diagnosis of colonic cancer and
            method of using the same
  JOURNAL   Patent: JP 2002533082-A 225 08-OCT-2002;
            CORIXA CORP
COMMENT     OS   Homo sapiens (human)
            PN   JP 2002533082-A/225
            PD   08-OCT-2002
            PF   23-DEC-1999 JP 2000589697
            PR   23-DEC-1998 US 09/221298, 02-JUL-1999 US 09/347496
            PR   22-SEP-1999 US 09/401064, 19-NOV-1999 US 09/444242
            PR   02-DEC-1999 US 09/454150
            PI   JIANGCHUN XU, MICHAEL J LODES, HEATHER SECRIST, DARIN R BENSON,
            PI   MADELEINE JOY MEAGHER, JOHN STOLK, TONGTONG WANG, JIANG YUQIU
            PC   C12N15/09,A61K31/711,A61K35/14,A61K38/00,A61K39/00,A61K39/395,
            PC   A61K39/395,
            PC   A61P35/00,C07K14/47,C07K16/18,C07K19/00,C12N1/15,C12N1/19,
            PC   C12N1/21,
            PC   C12N5/06,C12N5/10,C12Q1/02,C12Q1/68,G01N33/53,G01N33/53,G01N33/
            PC   566,
            PC   G01N33/574,G01N33/577,G01N33/58,C12N15/00,C12N5/00,C12N5/00,
            PC   A61K37/02
            CC   Compounds for immunotherapy and diagnosis of colonic cancer
            CC   and method of
            CC   using the same
            FH   Key             Location/Qualifiers
            FT   source          1..508
            FT                   /organism='Homo sapiens (human)'.
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN

Query Match 61.6%; Score 499; DB 6; Length 508; 

Best Local Similarity 100.0%; Pred. No. 6.9e-123; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 6
BD265239
LOCUS       BD265239     508 bp    DNA     linear   PAT 17-JUL-2003
DEFINITION  Compounds for immunotherapy and diagnosis of colonic cancer and
            method of using the same.
ACCESSION   BD265239
VERSION     BD265239.1  GI:33075007
KEYWORDS    JP 2002533082-A/237.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 508)
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Benson,D.R., Meagher,M.J., Stolk,J.,
            Wang,T. and Yuqiu,J.
  TITLE     Compounds for immunotherapy and diagnosis of colonic cancer and
            method of using the same
  JOURNAL   Patent: JP 2002533082-A 237 08-OCT-2002;
            CORIXA CORP
COMMENT     OS   Homo sapiens (human)
            PN   JP 2002533082-A/237
            PD   08-OCT-2002
            PF   23-DEC-1999 JP 2000589697
            PR   23-DEC-1998 US 09/221298, 02-JUL-1999 US 09/347496
            PR   22-SEP-1999 US 09/401064, 19-NOV-1999 US 09/444242
            PR   02-DEC-1999 US 09/454150
            PI   JIANGCHUN XU, MICHAEL J LODES, HEATHER SECRIST, DARIN R BENSON,
            PI   MADELEINE JOY MEAGHER, JOHN STOLK, TONGTONG WANG, JIANG YUQIU
            PC   C12N15/09,A61K31/711,A61K35/14,A61K38/00,A61K39/00,A61K39/395,
            PC   A61K39/395,
            PC   A61P35/00,C07K14/47,C07K16/18,C07K19/00,C12N1/15,C12N1/19,
            PC   C12N1/21,
            PC   C12N5/06,C12N5/10,C12Q1/02,C12Q1/68,G01N33/53,G01N33/53,G01N33/
            PC   566,
            PC   G01N33/574,G01N33/577,G01N33/58,C12N15/00,C12N5/00,C12N5/00,
            PC   A61K37/02
            CC   Compounds for immunotherapy and diagnosis of colonic cancer
            CC   and method of
            CC   using the same
            FH   Key             Location/Qualifiers
            FT   source          1..508
            FT                   /organism='Homo sapiens (human)'.
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN

  Query Match             61.6%;  Score 499;  DB 6;  Length 508;
  Best Local Similarity   100.0%;  Pred. No. 6.9e-123;
  Matches  499;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 7
AR401213
LOCUS       AR401213     508 bp    DNA     linear   PAT 18-DEC-2003
DEFINITION  Sequence 233 from patent US 6623923.
ACCESSION   AR401213
VERSION     AR401213.1  GI:40148513
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 508)
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Meagher,M.J., Stolk,J., Benson,D.R.
            and Wang,T.
  TITLE     Compounds for immunotherapy and diagnosis of colon cancer and
            methods for their use
  JOURNAL   Patent: US 6623923-A 233 23-SEP-2003;
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="unknown"
                     /mol_type="genomic DNA"
ORIGIN

  Query Match             61.6%;  Score 499;  DB 6;  Length 508;
  Best Local Similarity   100.0%;  Pred. No. 6.9e-123;
  Matches  499;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 8
AR401225
LOCUS       AR401225     508 bp    DNA     linear   PAT 18-DEC-2003
DEFINITION  Sequence 245 from patent US 6623923.
ACCESSION   AR401225
VERSION     AR401225.1  GI:40148525
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 508)
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Meagher,M.J., Stolk,J., Benson,D.R.
            and Wang,T.
  TITLE     Compounds for immunotherapy and diagnosis of colon cancer and
            methods for their use
  JOURNAL   Patent: US 6623923-A 245 23-SEP-2003;
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="unknown"
                     /mol_type="genomic DNA"
ORIGIN

Query Match 61.6%; Score 499; DB 6; Length 508; 

Best Local Similarity 100.0%; Pred. No. 6.9e-123; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 9
AX192666
LOCUS       AX192666     508 bp    DNA     linear   PAT 15-AUG-2001
DEFINITION  Sequence 233 from Patent WO0149716.
ACCESSION   AX192666
VERSION     AX192666.1  GI:15210622
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Benson,D.R., Meagher,M.J.,
            Stolk,J.A., King,G.E., Wang,T. and Jiang,Y.
  TITLE     Compounds for immunotherapy and diagnosis of colon cancer and
            methods for their use
  JOURNAL   Patent: WO 0149716-A 233 12-JUL-2001;
            CORIXA CORPORATION (US)
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"
ORIGIN



Query Match 61.6%; Score 499; DB 6; Length 508; 

Best Local Similarity 100.0%; Pred. No. 6.9e-123; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 10
AX192678
LOCUS       AX192678     508 bp    DNA     linear   PAT 15-AUG-2001
DEFINITION  Sequence 245 from Patent WO0149716.
ACCESSION   AX192678
VERSION     AX192678.1  GI:15210634
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Xu,J., Lodes,M.J., Secrist,H., Benson,D.R., Meagher,M.J.,
            Stolk,J.A., King,G.E., Wang,T. and Jiang,Y.
  TITLE     Compounds for immunotherapy and diagnosis of colon cancer and
            methods for their use
  JOURNAL   Patent: WO 0149716-A 245 12-JUL-2001;
            CORIXA CORPORATION (US)
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"
ORIGIN

Query Match 61.6%; Score 499; DB 6; Length 508; 

Best Local Similarity 100.0%; Pred. No. 6.9e-123; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 11 
AF353993 

LOCUS       AF353993     630 bp    mRNA    linear   ROD 29-MAY-2001
DEFINITION  Mus musculus beta-amyloid binding protein (Bbp) mRNA, complete cds.
ACCESSION   AF353993
VERSION     AF353993.  GI:13625464
KEYWORDS
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 630)
  AUTHORS   Kajkowski,E.M., Lo,C.F., Ning,X., Walker,S., Sofia,H.J., Wang,W.,
            Edris,W., Chanda,P., Wagner,E., Vile,S., Ryan,K.,
            McHendry-Rinde,B., Smith,S.C., Wood,A., Rhodes,K.J., Kennedy,J.D.,
            Bard,J., Jacobsen,J.S. and Ozenberger,B.A.
  TITLE     beta-Amyloid peptide-induced apoptosis regulated by a novel
            protein containing a g protein activation module
  JOURNAL   J. Biol. Chem. 276 (22), 18748-18756 (2001)
  MEDLINE   21276355
   PUBMED   11278849
REFERENCE   2  (bases 1 to 630)
  AUTHORS   Ozenberger,B.A., Howland,D.S., Lo,C.F. and She,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-FEB-2001) Wyeth Neuroscience, CN 8000, Princeton, NJ
            08543, USA
FEATURES             Location/Qualifiers
     source          1..630
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="BALB/c"
                     /db_xref="taxon:10090"
     gene            1..630
                     /gene="Bbp"
     CDS             4..630
                     /gene="Bbp"
                     /note="integral membrane glycoprotein"
                     /codon_start=1
                     /product="beta-amyloid binding protein"
                     /protein_id="AAK35067.1"
                     /db_xref="GI:13625465"
                     /translation="MAAAWPAGRASPAAGPPGLLRTLWLVTVAAGHCGAAASGAVGGE
                     ETPKCEDLRVGQYICKEPKINDATQEPVNCTNYTAHVQCFPAPKITCKDLSGNETHFT
                     GSEVGFLKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGI
                     GSLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFRKTQLYP"
     misc_feature    349..414
                     /gene="Bbp"
                     /note="Region: transmembrane domain"
     misc_feature    457..528
                     /gene="Bbp"
                     /note="Region: transmembrane domain"
ORIGIN



Query Match 58.2%; Score 471.6; DB 10; Length 630;
Best Local Similarity 85.4%; Pred. No. 1.6e-115;
Matches 538; Conservative 0; Mismatches 89; Indels 3; Gaps 1;
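
The score in the statistics above can be reproduced from the column counts. The sketch below assumes that IDENTITY_NUC awards +1 per identical base and -0.6 per mismatch, and that each gap costs Gapop plus Gapext per gapped base. The mismatch value and the gap-cost convention are inferences that happen to fit the numbers in this listing; they are not quoted from the program's documentation. Ambiguity codes (the "Conservative" column) are left out because their scores cannot be read off the listing.

```python
def alignment_score(matches, mismatches, gap_lengths,
                    match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Score an alignment from its column counts.

    gap_lengths holds one entry per gap (its length in bases).
    The match/mismatch values are assumptions inferred from the listing.
    """
    gap_cost = sum(gap_open + gap_ext * n for n in gap_lengths)
    return round(matches * match + mismatches * mismatch - gap_cost, 1)


if __name__ == "__main__":
    # 538 matches, 89 mismatches, one gap of 3 bases (Indels 3; Gaps 1).
    print(alignment_score(538, 89, [3]))
```

With 538 matches, 89 mismatches and a single 3-base gap this gives 538 - 53.4 - 13 = 471.6, the score reported for this result.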



Qy     184 AAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGACTC 243
           || |||||||||||||||||  | |||| | || | || | ||| | | |  |  | ||
Db       1 AACATGGCGGCCGCCTGGCCCGCGGGTCGGGCTTCCCCAGCGGCGGGGCCTCCGGGCCTT 60

Qy     244 GTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCAC---C 300
            |  |   ||||||| ||||  | ||| |  | |||| ||| ||||||| |||| |   |
Db      61 CTCCGCACCCTGTGGCTCGTGACGGTCGCCGCGGGACACTGTGGGGCTGCTGCCTCTGGC 120

Qy     301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
            | | ||||||||||||| | |  ||||| ||||||||||  ||||||||||||||||||
Db     121 GCTGTCGGGGGCGAGGAGACACCCAAGTGTGAGGACCTCAGGGTGGGACAATATATTTGT 180

Qy     361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
           ||||| |||||||||||||| |||||||||||||||||||| ||||||||||||||||||
Db     181 AAAGAACCAAAAATAAATGATGCTACGCAAGAACCAGTTAATTGTACAAACTACACAGCT 240

Qy     421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
           ||||||   ||||||||||||||||| ||||||||||||||||  ||||| |||||||||
Db     241 CATGTTCAATGTTTTCCAGCACCCAAAATAACTTGTAAGGATTTGAGTGGTAATGAAACA 300

Qy     481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
           ||||||||||| |  ||||| |||||| ||||||||||||||||||||||||| ||||||
Db     301 CATTTTACTGGAAGTGAAGTCGGTTTTCTCAAGCCCATATCTTGCCGAAATGTGAATGGC 360

Qy     541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
           ||||| |||||||||||||| ||||| ||||| ||| | |||||| ||||||||||||||
Db     361 TATTCGTACAAAGTGGCAGTTGCATTATCTCTCTTTTTGGGATGGCTGGGAGCAGATCGA 420

Qy     601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
           |||||||| ||||| ||||| || || |||||||| |||||||| ||||| ||||| |||
Db     421 TTTTACCTCGGATATCCTGCCTTAGGCTTGTTAAAATTTTGCACCGTAGGATTTTGCGGA 480

Qy     661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     481 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 540

Qy     721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
           ||||||||||||||||| || ||||||||||| ||||||||||| || ||||||||||||
Db     541 AGTAGTTACATTATAGACTATTATGGAACCAGGCTTACAAGACTCAGCATTACTAATGAA 600

Qy     781 ACATTTAGAAAAACGCAATTATATCCATAA 810
           |||||||||||||| ||  | || ||||||
Db     601 ACATTTAGAAAAACCCAGCTGTACCCATAA 630



RESULT 12 

BD076181 

LOCUS       BD076181     440 bp    DNA     linear   PAT 27-AUG-2002
DEFINITION  5' EST of tissue-nonspecific secretory protein.
ACCESSION   BD076181
VERSION     BD076181.1  GI:22621784
KEYWORDS    JP 2001512011-A/129.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 440)
  AUTHORS   Edwards,J.B.D.M., Duclert,A. and Lacroix,B.
  TITLE     5' EST of tissue-nonspecific secretory protein
  JOURNAL   Patent: JP 2001512011-A 129 21-AUG-2001;
            GENSET


COMMENT     OS   Homo sapiens (human)
            PN   JP 2001512011-A/129
            PD   21-AUG-2001
            PF   31-JUL-1998 JP 2000505289
            PR   01-AUG-1997 US 08/905135
            PI   JEAN BAPTISTE DUMAS MILNE EDWARDS,AYMERIC DUCLERT,BRUNO
            PI   LACROIX
            PC   C12N15/09,C12N15/09,C07K14/47,C12Q1/68,C12N15/00,C12N15/00
            CC   blastn
            CC   identity 97
            CC   region 113..315
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 99
            CC   region 304..411
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 43..120
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 44..317
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 340..410
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 94
            CC   region 8..46
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 125..414
            CC   id N47594
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 49..119
            CC   id N47594
            CC   est
            CC   blastn
            CC   identity 98
            CC   region 45..385
            CC   id HUM159G08B
            CC   est
            CC   blastn
            CC   identity 95
            CC   region 1..47
            CC   id HUM159G08B
            CC   est
            CC   blastn
            CC   identity 98
            CC   region 92..316
            CC   id N34957
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 30..97
            CC   id N34957
            CC   est
            CC   blastn
            CC   identity 91
            CC   region 312..379
            CC   id N34957
            CC   est
            CC   Von Heijne matrix
            CC   score 8.7
            CC   seq AVALSLFLGWLGA/DR
            FH   Key             Location/Qualifiers
            FT   misc_feature    143..345
            FT   misc_feature    335..442
            FT   misc_feature    72..149
            FT   misc_feature    72..345
            FT   misc_feature    372..442
            FT   misc_feature    35..73
            FT   misc_feature    153..442
            FT   misc_feature    77..147
            FT   misc_feature    72..412
            FT   misc_feature    27..73
            FT   misc_feature    143..367
            FT   misc_feature    80..147
            FT   misc_feature    362..429
            FT   sig_peptide     24..431.

FEATURES             Location/Qualifiers
     source          1..440
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"

ORIGIN 



Query Match 53.9%; Score 436.8; DB 6; Length 440; 

Best Local Similarity 99.5%; Pred. No. 3.4e-106; 

Matches 436; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy     166 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAG 225
           |||||||||||||||||||||||||||||||||||||::|||||||||||||||||||||
Db       3 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCSDTCTGGTCCGTCTGCTCCGGAG 62

Qy     226 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 285
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      63 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 122

Qy     286 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 345
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     123 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 182

Qy     346 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 405
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     183 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 242

Qy     406 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 465
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     243 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 302

Qy     466 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 525
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     303 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 362

Qy     526 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 585
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     363 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 422

Qy     586 TTGGGAGCAGATCGATTT 603
           ||||||||||||||||||
Db     423 TTGGGAGCAGATCGATTT 440



RESULT 13 

AC102262/c 

LOCUS       AC102262   193660 bp    DNA     linear   HTG 27-FEB-2003
DEFINITION  Mus musculus clone RP24-216B4, WORKING DRAFT SEQUENCE, 9 unordered
            pieces.
ACCESSION   AC102262
VERSION     AC102262.3  GI:28570462
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_FULLTOP.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 193660)
  AUTHORS   Birren,B., Nusbaum,C. and Lander,E.
  TITLE     Mus musculus, clone RP24-216B4
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 193660)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C., Lander,E., Ali,A., Allen,N.,
            Anderson,S., Barna,N., Bastien,V., Boguslavkiy,L., Boukhgalter,B.,
            Brown,A., Camarata,J., Campopiano,A., Chang,J., Chazaro,B.,
            Choepel,Y., Colangelo,M., Collins,S., Collymore,A., Cook,A.,
            Cooke,P., DeArellano,K., Dewar,K., Diaz,J.S., Dodge,S., Faro,S.,
            Ferreira,P., FitzHugh,W., Gage,D., Galagan,J., Gardyna,S.,
            Ginde,S., Gord,S., Goyette,M., Graham,L., Grand-Pierre,N.,
            Hagos,B., Heaford,A., Horton,L., Hulme,W., Iliev,I., Johnson,R.,
            Jones,C., Kamat,A., Karatas,A., LaRocque,K.,
            Lamazares,R., Landers,T., Lehoczky,J., Levine,R., Liu,G.,
            MacLean,C., Macdonald,P., Major,J., Marquis,N., Matthews,C.,
            McCarthy,M., McEwan,P., McKernan,K., McPheeters,R., Meldrim,J.,
            Meneus,L., Mihova,T., Mlenga,V., Murphy,T., Naylor,J., Nguyen,C.,
            Norbu,C., Norman,C.H., O'Connor,T., O'Donnell,P., O'Neil,D.,
            Oliver,J., Peterson,K., Phunkhang,P., Pierre,N., Pollara,V.,
            Raymond,C., Retta,R., Rieback,M., Riley,R., Rise,C., Rogov,P.,
            Roman,J., Rosetti,M., Roy,A., Santos,R., Schauer,S., Schupback,R.,
            Seaman,S., Severy,P., Spencer,B., Stange-Thomann,N., Stojanovic,N.,
            Strauss,N., Subramanian,A., Talamas,J., Tesfaye,S., Theodore,J.,
            Topham,K., Travers,M., Travis,N., Trigilio,J., Vassiliev,H.,
            Viel,R., Vo,A., Wilson,B., Wu,X., Wyman,D., Ye,W.J., Young,G.,
            Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-NOV-2001) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
REFERENCE   3  (bases 1 to 193660)
  AUTHORS   Birren,B., Nusbaum,C., Lander,E., Abouelleil,A., Allen,N.,
            Anderson,S., Arachchi,H.M., Barna,N., Bastien,V., Bloom,T.,
            Boguslavkiy,L., Boukhgalter,B., Camarata,J., Chang,J., Choepel,Y.,
            Collymore,A., Cook,A., Cooke,P., Corum,B., DeArellano,K.,
            Diaz,J.S., Dodge,S., Dooley,K., Dorris,L., Erickson,J., Faro,S.,
            Ferreira,P., FitzGerald,M., Gage,D., Galagan,J., Gardyna,S.,
            Graham,L., Grand-Pierre,N., Hafez,N., Hagopian,D., Hagos,B.,
            Hall,J., Horton,L., Hulme,W., Iliev,I., Johnson,R., Jones,C.,
            Kamat,A., Karatas,A., Kells,C., Landers,T., Levine,R.,
            Lindblad-Toh,K., Liu,G., Lui,A., Mabbitt,R., MacLean,C.,
            Macdonald,P., Major,J., Manning,J., Matthews,C., McCarthy,M.,
            Meldrim,J., Meneus,L., Mihova,T., Mlenga,V., Murphy,T., Naylor,J.,
            Nguyen,C., Nicol,R., Norbu,C., O'Connor,T., O'Donnell,P.,
            O'Neil,D., Oliver,J., Peterson,K., Phunkhang,P., Pierre,N.,
            Rachupka,A., Ramasamy,U., Raymond,C., Retta,R., Rise,C., Rogov,P.,
            Roman,J., Schauer,S., Schupback,R., Seaman,S., Severy,P., Smith,C.,
            Spencer,B., Stange-Thomann,N., Stojanovic,N., Stubbs,M.,
            Talamas,J., Tesfaye,S., Theodore,J., Topham,K., Travers,M.,
            Vassiliev,H., Venkataraman,V.S., Viel,R., Vo,A., Wilson,B., Wu,X.,
            Wyman,D., Young,G., Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-FEB-2003) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On Feb 27, 2003 this sequence version replaced gi:22381123.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.washington.edu/RM/RepeatMasker.html
            Genome Center
            Center: Whitehead Institute/ MIT Center for Genome Research
            Center code: WIBR
            Web site: http://www-seq.wi.mit.edu
            Contact: sequence_submissions@genome.wi.mit.edu
            Project Information
            Center project name: L18275
            Center clone name: 216_B_4
            Summary Statistics
            Sequencing vector: Plasmid; n/a; 100% of reads
            Chemistry: Dye-terminator Big Dye; 100% of reads
            Assembly program: Phrap; version 0.960731
            Consensus quality: 191505 bases at least Q40
            Consensus quality: 192178 bases at least Q30
            Consensus quality: 192605 bases at least Q20
            Insert size: 188000; agarose-fp
            Insert size: 192860; sum-of-contigs
            Quality coverage: 10.3 in Q20 bases; agarose-fp
            Quality coverage: 10.0 in Q20 bases; sum-of-contigs



            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 9 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1    16245: contig of 16245 bp in length
            *    16246    16345: gap of      100 bp
            *    16346    17545: contig of  1200 bp in length
            *    17546    17645: gap of      100 bp
            *    17646    21998: contig of  4353 bp in length
            *    21999    22098: gap of      100 bp
            *    22099    31133: contig of  9035 bp in length
            *    31134    31233: gap of      100 bp
            *    31234    41801: contig of 10568 bp in length
            *    41802    41901: gap of      100 bp
            *    41902    62223: contig of 20322 bp in length
            *    62224    62323: gap of      100 bp
            *    62324    91590: contig of 29267 bp in length
            *    91591    91690: gap of      100 bp
            *    91691   123816: contig of 32126 bp in length
            *   123817   123916: gap of      100 bp
            *   123917   193660: contig of 69744 bp in length
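
The contig layout above can be checked against the summary statistics of the same record. The short sketch below recomputes each contig length from its coordinates and compares the total with the "sum-of-contigs" insert size (192860) and the record length (193660). The coordinates are copied from the listing; the variable and function names are illustrative.

```python
# (start, end) coordinates of the nine contigs, as listed in the COMMENT block.
CONTIGS = [
    (1, 16245), (16346, 17545), (17646, 21998), (22099, 31133),
    (31234, 41801), (41902, 62223), (62324, 91590), (91691, 123816),
    (123917, 193660),
]
GAP_BP = 100  # every gap between contigs is a run of 100 N


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


contig_lengths = [span_length(s, e) for s, e in CONTIGS]
sum_of_contigs = sum(contig_lengths)
record_length = sum_of_contigs + GAP_BP * (len(CONTIGS) - 1)

if __name__ == "__main__":
    print(contig_lengths)
    print(sum_of_contigs, record_length)
```

The recomputed lengths agree with the "contig of ... bp" column, and the totals agree with the insert size and LOCUS length reported for AC102262.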



FEATURES             Location/Qualifiers
     source          1..193660
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10090"
                     /clone="RP24-216B4"
                     /clone_lib="RPCI-24 Male Mouse BAC"
     misc_feature    1..16245
                     /note="assembly_fragment
                     clone_end:SP6
                     vector_side:left"
     misc_feature    16346..17545
                     /note="assembly_fragment"
     misc_feature    17646..21998
                     /note="assembly_fragment"
     misc_feature    22099..31133
                     /note="assembly_fragment"
     misc_feature    31234..41801
                     /note="assembly_fragment"
     misc_feature    41902..62223
                     /note="assembly_fragment"
     misc_feature    62324..91590
                     /note="assembly_fragment"
     misc_feature    91691..123816
                     /note="assembly_fragment"
     misc_feature    123917..193660
                     /note="assembly_fragment
                     clone_end:T7
                     vector_side:right"

ORIGIN 



Query Match 53.2%; Score 430.6; DB 2; Length 193660;
Best Local Similarity 81.7%; Pred. No. 5.2e-104;
Matches 535; Conservative 0; Mismatches 114; Indels 6; Gaps



Qy 159 AAGTGGCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGC 218 

|| Ml Ml III I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 138 927 AAAAGGCTGGGAGGACGCAGTGCCCAACATGGCGGCTGCCTGGCCCGCGGGCCGGGCTTC 



138868 



Qy 219 TCCGGAGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGG 278 

Ml | I I I I I I III I I I II I I I I I I I II I I I M 

Db 138867 CCCAGCAGCCGGGCCTCCGGGCCTTCTCCGCACTCTGTGGCTCGTGATGGTCACCGAGGG 



138808 



Qy 279 ACCCTGGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCT 338 

I I I I I II II I I I I I I I I Ml I III III I I I I I I I I I I I I I 
Db 138807 ACACTGTGGGGCTGCTGCCTCTGGCGCTGTCGGGGGCGAGGAG AGGT GTGAGGACCT 



138751 



Qy 339 CAAAGT GGGACAAT AT ATT T GT — AAAGAT C CAAAAATAAAT GAC GCT AC GCAAGAACC A 396 

|| I I II I I I I I I I I I II I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 138750 CAGGGT GGGACAAT AT AT T T GTAAAAAGAAC CAAAAATAAAT GAT GCT ACGCAAGAACCA 



138691 



Qy 397 GT TAACT GTACAAACT AC ACAGCT CATGT TTCCTGTTTTC CAGCAC C CAACATAACT T GT 456 

| | | I I I II I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I I I I I I I I I I I I 1 I I 
Db 138690 GTTAATTGTACAAACTACACAGCTCATGTTCAATGTTTTCCAGCACCCAAAATAACTTAT 



138631 



Qy 457 - AAGGAT T C C AGT GGCAAT GAAAC ACAT TTT ACT GGGAAC GAAGT T GGT T TTT TCAAGC C 515 

I I I M I I I I I I I I II I I I I I I I I I I I I I I I M I I I I I I I I II I I I I 

Db 138630 AAAGGAT TT GAGTGGTAAT GAAAC ACAT T TT ACT GGAAGT GAAGT C GGT T T T CT CAAGC C 



138571 



Qy 516 CAT AT CT T GC C GAAAT GTAAAT GGCT AT T CCTACAAAGT GGCAGT C GC AT T GT CT CTTT T 575 

I I I I Mill I I I I I I I I I I I I I I I II I II I I I I UNI I II I I II 

Db 138570 CATAACTTGCCGAAATGTGAATGGCTATTCGTACAT^lGTGGCAGTTGCATTATCTCTCTT 



138511 



Qy 576 TCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAA 635 

| | | I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I II I I I II I II I 
Db 138510 TTTGGGATGGCTGGGAGCAGATCGATTTTACCTCGGATATCCTGCCTTACGCTTGTTAAA 

138451 

Qy 636 GTTT TGCACT GTAGGGT T TT GTGGAAT T GGGAGC CTAAT T GATTT CAT T CT T AT T T CAAT 695 

M I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I 
Db 1384 50 TTTTTGCACCGTAGGATTTTGCGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAAT 

138391 

Qy 696 GCAGATT GTT GGACCTT CAGATGGAAGTAGTTACATT ATAGATTACT ATGGAACCAGACT 755 

11111111111-mi1111iii1iiiii-HiHH-IH+IHH-IHH-IH-IHH-IHHH-l-l-H-l-l- 



Db 138390 GCAGATT GT T GGAC CT T CAGAT GGAAGT AGTTACATT AT AGACT ATT AT GGAAC C AGGCT 

138331 

Qy 756 TACAAGACT GAGT ATT ACTAAT GAAACAT TT AGAAAAACGCAATT AT ATCCAT AA 810 

| | | I I I I I I || I I I I I I I I I I I I I I I II I I I I I I I II I I I I I M I I I 
Db 138330 TACAAGACTCAGCATTACTAATGAAACATTTAGAAAAACCTAGCTGTACCCATAA 138276 



RESULT 14 
BD076249 



LOCUS       BD076249     455 bp    DNA     linear   PAT 27-AUG-2002
DEFINITION  5' EST of tissue-nonspecific secretory protein.
ACCESSION   BD076249
VERSION     BD076249.1  GI:22621852
KEYWORDS    JP 2001512011-A/197.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 455)
  AUTHORS   Edwards,J.B.D.M., Duclert,A. and Lacroix,B.
  TITLE     5' EST of tissue-nonspecific secretory protein
  JOURNAL   Patent: JP 2001512011-A 197 21-AUG-2001;
            GENSET
COMMENT     OS   Homo sapiens (human)
            PN   JP 2001512011-A/197
            PD   21-AUG-2001
            PF   31-JUL-1998 JP 2000505289
            PR   01-AUG-1997 US 08/905135
            PI   JEAN BAPTISTE DUMAS MILNE EDWARDS,AYMERIC DUCLERT,BRUNO
            PI   LACROIX
            PC   C12N15/09,C12N15/09,C07K14/47,C12Q1/68,C12N15/00,C12N15/00
            CC   blastn
            CC   identity 99
            CC   region 125..358
            CC   id N47594
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 49..119
            CC   id N47594
            CC   est
            CC   blastn
            CC   identity 96
            CC   region 374..438
            CC   id N47594
            CC   est
            CC   blastn
            CC   identity 96
            CC   region 113..315
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 43..120
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 98
            CC   region 304..355
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 95
            CC   region 371..416
            CC   id AA143062
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 44..317
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 370..416
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 8..46
            CC   id HUM172D06B
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 45..359
            CC   id HUM159G08B
            CC   est
            CC   blastn
            CC   identity 97
            CC   region 1..47
            CC   id HUM159G08B
            CC   est
            CC   blastn
            CC   identity 98
            CC   region 92..316
            CC   id N34957
            CC   est
            CC   blastn
            CC   identity 100
            CC   region 30..97
            CC   id N34957
            CC   est
            CC   Von Heijne matrix
            CC   score 5.5
            CC   seq LVGVLWFVSVTTG/PW
            CC   n=a, g, c or t

            FH   Key             Location/Qualifiers
            FT   misc_feature    141..374
            FT   misc_feature    65..135
            FT   misc_feature    388..452
            FT   misc_feature    131..333
            FT   misc_feature    60..137
            FT   misc_feature    323..374
            FT   misc_feature    388..433
            FT   misc_feature    60..333
            FT   misc_feature    388..434
            FT   misc_feature    23..61
            FT   misc_feature    60..374
            FT   misc_feature    15..61
            FT   misc_feature    131..355
            FT   misc_feature    68..135
            FT   sig_peptide     12..104
            FT   misc_feature    288
            FT   misc_feature    375..376
            FT   misc_feature    385..387.
FEATURES             Location/Qualifiers
     source          1..455
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"

ORIGIN 

Query Match 52.5%; Score 425.2; DB 6; Length 455; 

Best Local Similarity 96.3%; Pred. No. 4.5e-103; 

Matches 439; Conservative 3; Mismatches 12; Indels 2; Gaps 1; 

Qy     177 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGC 236
           |||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
Db       2 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGKCTGCTCCGGAGGCCGTGACGGC 61

Qy     237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      62 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 121

Qy     297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 356
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     122 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 181

Qy     357 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 416
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     182 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 241

Qy     417 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 476
           |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db     242 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATNCCAGTGGCAATGA 301

Qy     477 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 536
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     302 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 361

Qy     537 TGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 596
           |||||||||||||    |:         :|||||||||||||||||||||||||||||||
Db     362 TGGCTATTCCTAC--NNTKAGCAGTNNNWTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 419

Qy     597 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTT 632
           ||||||||||||||||||||||||||||||||||||
Db     420 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTT 455



RESULT 15 
AX892343 — ~ 
LOCUS       AX892343     487 bp    DNA     linear   PAT 18-DEC-2003
DEFINITION  Sequence 8206 from Patent EP1033401.
ACCESSION   AX892343
VERSION     AX892343.1  GI:40047227
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Dumas Milne Edwards,J.B., Duclert,A. and Giordano,J.Y.
  TITLE     Expressed sequence tags and encoded human proteins
  JOURNAL   Patent: EP 1033401-A 8206 06-SEP-2000;
            Genset (FR)
FEATURES             Location/Qualifiers
     source          1..487
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"

ORIGIN 

Query Match 50.8%; Score 411.4; DB 6; Length 487; 

Best Local Similarity 91.4%; Pred. No. 2.3e-99; 

Matches 445; Conservative 9; Mismatches 10; Indels 23; Gaps 1; 

GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 223 

| I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I : I I I I I I I I II : I I I : I I I I I I I 

GCGAGAAAGTGTCGGTCTCCT^AGATGGCGGCCGCMTGGACGTCTGGWCCGAMTGCACCGG 60 

AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 283 
I | | | | M I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I II : I I I : I I I I I I I I I I I I I I I 
AAGC C GT GAC GGC C AGAMT CGTT GGT GT C CT GT GGT T C GTMT CART CACTACAGGAC C CT 12 0 

GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 343 

| Ml II I II Ml II I lllilltl I 1111:11 I I III I I I MM I M I II IIMIMI II 
GGGGGGCTGTTGCCACCTCCGCCGGGGGCRAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 18 0 

TGGGACAATATATT T GT AAAGAT C CAAAAAT AAAT GA 38 0 

M :: | | | | | | | I I I I I I I I M I I II I I I I I I 

TGRRACAATAT CCT CTGTGGAGAACACCCCCCCAT GGAGGCGAGATC CAAAAAT AAAT GA 240 

C GCTAC GCAAGAACCAGT TAACT GT ACAAACT ACACAGCT CAT GTTTCCTGTTTTC CAG C 44 0 

I | | | | | | I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I I I M I I I I I I I I 
C GCT ACGCAAGAAC CAGT TAACT GT ACAAACTAC ACAGCT CAT GT TT C CT GTT T T CCAGC 300 

ACCCAACATAACTT GTAAGGATTCCAGT GGCAAT GAAACACATTTTACTGGGAACGAAGT 500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I M I 

AC C CAACAT AACT T GTAAGGATT C CAGT GGCAAT GAAACAC ATT TT ACT GGGAAC GAAGT 360 

TGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGT 560 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I I I I I I I I I II I I I I I 
TGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGT 420 

CGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGC 62 0 
I I I I M I I I I I I I M I I I I I I I Ml I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I M I 
CGCAT T GT CT CT TT TT CTT GGAT GGT T GGGAGC AGAT C GAT TT T AC CT T GGAT AC CCT GC 480 



I II I I I I 



yy 


1 64 
± \j i 


Du 


-L 




i i 


Du 




s2Y 


284 


Db 


121 


vy 


344 


Db 


181 


Qy 


381 


Db 


241 


Qy 


441 


Db 


301 


Qy 


501 


Db 


361 


Qy 


561 


Db 


421 


Qy 


621 


Db 


481 
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OM nucleic - nucleic search, using sw model 

Run on: March 4, 2004, 03:41:27 ; Search time 385 Seconds
(without alignments)

Title: US-09-852-100B-1
Perfect score: 810
Sequence: 1 atgcatattttaaaagggtc aaacgcaattatatccataa 810

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 3373863 seqs, 2124099041 residues

Total number of hits satisfying chosen parameters: 6747726

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : N_Geneseq_29 Jan04 : * 

1: geneseqnl980s : * 
2 : geneseqnl990s : * 
3: geneseqn2000s : * 
4: geneseqn2001as : * 
5: geneseqn2001bs : * 
6: geneseqn2002s : * 
7: geneseqn2003as : * 
8 : geneseqn2003bs : * 
9: geneseqn2003cs : * 
10 : geneseqn2004s : * 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
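
The header above names a Smith-Waterman ("sw") model with the IDENTITY_NUC scoring table, Gapop 10.0 and Gapext 1.0. As a minimal sketch of that kind of search, the function below computes the best local alignment score of two sequences with affine gap penalties (the Gotoh recurrences). It is not the search program's implementation. The +1 match and -0.6 mismatch values, and the convention that a gap of n bases costs Gapop + n × Gapext, are assumptions inferred from the scores in this listing.

```python
def smith_waterman_affine(query, target, match=1.0, mismatch=-0.6,
                          gap_open=10.0, gap_ext=1.0):
    """Best local alignment score with affine gaps (score only, no traceback)."""
    neg_inf = float("-inf")
    m = len(target)
    h_prev = [0.0] * (m + 1)      # best score ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # best score ending in a gap in target
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # best score ending in a gap in query
        for j in range(1, m + 1):
            # Opening a gap costs gap_open + gap_ext; extending costs gap_ext.
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGTACGT"))
```

A sequence aligned against itself scores its own length, which corresponds to the "Perfect score" of 810 reported for the 810-base query.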



SUMMARIES 



Result         Query
   No.  Score  Match  Length  DB  ID         Description
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100. 


0 
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AAX05735 


Aax05735 


Human bet 


2 


810 


100. 


0 


810 


3 


AAZ52369 


Aaz52369 


Human bet 


3 


810 


100. 


0 


810 
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AAD51940 


Aad51940 


Human BBP 


4 


810 


100. 


0 
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7 


AAD51979 


Aad51979 


Human BBP 


5 
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4 
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2 


AAX97705 


Aax97705 
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AAA77946 


Aaa77946 


cDNA enco 


7 


499 


61. 


6 


508 


3 


AAA77958 


Aaa77958 


cDNA enco 
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499 
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4 


AAI28684 
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Colon turn 
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i 9ft Mf\ 


V-.VJXVJ1I U Uill 




10 


499 


61 , 


. 6 


508 


/ 


ABZozooz 


r\D O v 






11 
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487 
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Aacu4 iji 


Human cpr 
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8 
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Acniy / uo 


n LillLcl 1 1 CLU.U. 




  17    322    39.8    323   7  ACD92727   Acd92727 Human col
  18  320.4    39.6    323   7  ACD92728   Acd92728 Human col
c 19  182.4    22.5          6  ABQ57089   Abq57089 Human col
  20  153.2    18.9    515   9  ADB56631   Adb56631 Toxicity-
  21  115.4    14.2    433   5  ABV17809   Abv17809 Human pro
  22  115.4    14.2    487   5  ABV47601   Abv47601 Human pro
  23    113    14.0    292   7  AAD51978   Aad51978 Human BBP
  24   99.8    12.3    298   2  AAX85735   Aax85735 Novel hum
c 25   64.6     8.0   2771   4  ABL16838   Abl16838 Drosophil
c 26   64.6     8.0   3642   4  ABL15742   Abl15742 Drosophil
  27   64.4     8.0    706   4  ABL16839   Abl16839 Drosophil
  28     51     6.3   1369   2  AAX85024   Aax85024 Human sec
  29     51     6.3   1369   7  ADA56147   Ada56147 Gene enco
  30     51     6.3   1369   7  ACD18950   Acd18950 Novel hum
  31     51     6.3   1369   7  ACC50529   Acc50529 Human sec
  32     51     6.3   1369   7  ABZ71291   Abz71291
  33     51     6.3   1369   8  ADB91205   Adb91205 Human sec
  34     51     6.3   1369   9  ADC73589   Adc73589 Human sec
  35   50.4     6.2    690   3  AAA64413   Aaa64413 Open read
  36   50.4     6.2    854   3  AAA64412   Aaa64412 DNA encod
c 37   50.2     6.2    439   6  ABK62922   Abk62922 Rat seque
c 38   50.2     6.2    439   9  ADB56953   Adb56953 Toxicity-
  39   49.8     6.1    741   3  AAA64409   Aaa64409 Open read
  40   49.8     6.1    746   3  AAZozo / 1 Ad Z O Z O /X
  41   49.8     6.1   1406   7  ACC51100   Acc51100 Human Amy
  42   49.8     6.1   1455   4  AAF80523   Aaf80523 Receptor
  43   49.8     6.1   1473   3  AAA64408   Aaa64408 DNA encod
  44   49.8     6.1   1473   3  AAA64425   Aaa64425 DNA encod
  45   49.8     6.1   1473   3  AAA64424   Aaa64424 DNA encod

ALIGNMENTS 



RESULT 1 
AAX05735 

ID AAX05735 standard; mRNA; 810 BP. 
XX 

AC AAX05735;

XX 

DT 27-APR-1999 (first entry) 
XX 

DE Human beta-amyloid peptide-binding protein (BBP) encoding mRNA. 
XX 

KW Beta-amyloid peptide binding protein; BBP; beta-amyloid protein; BAP; 

KW human; Alzheimer's disease; ss.

XX 

OS Homo sapiens.
XX 



FH Key Location/Qualifiers 

FT CDS 1..810

FT /*tag= a 

FT /product= "BBP"

XX 

PN WO9846636-A2.
XX 

PD 22-OCT-1998. 
XX 

PF 14-APR-1998; 98WO-US007462.
XX

PR 16-APR-1997; 97US-0064583P.
XX 

PA (AMHP ) AMERICAN HOME PROD CORP. 
XX 

PI Ozenberger BA, Kajkowski EM, Jacobsen JS, Bard JA, Walker SG; 
XX 

DR WPI; 1999-080736/07. 

DR P-PSDB; AAW94291. 
XX 

PT Polynucleotide encoding beta-amyloid peptide binding protein - used to 

PT identify inhibitors of beta-amyloid peptide for treating Alzheimer's 

PT disease. 
XX 

PS Claim 1; Page 43-44; 59pp; English. 
XX 

CC This represents a nucleotide sequence encoding a beta-amyloid peptide 

CC binding protein (BBP). The polynucleotide comprising the entire BBP

CC nucleotide sequence of clone BBP1-fl is deposited under the accession

CC number ATCC 98617. The polynucleotide comprising a fragment of BBP 

CC (nucleotides 202-807 of the full length BBP) of clone pEK196 is deposited 

CC as ATCC 98399. Host cells transformed with a vector comprising the BBP 

CC nucleic acid are used for the recombinant production of the protein. The 

CC protein can be used in a method for diagnosing a disease characterised by 

CC aberrant expression of human beta-amyloid protein (BAP). The protein can

CC also be used in a method for screening for compounds which regulate 

CC expression of a BAP binding protein. The proteins, antibodies and 

CC identified compounds can be used in the treatment or prevention of 

CC Alzheimer's disease 

XX 

SQ Sequence 810 BP; 204 A; 183 C; 202 G; 221 T; 0 U; 0 Other; 

Query Match 100.0%; Score 810; DB 2; Length 810; 
Best Local Similarity 100.0%; Pred. No. 2e-233; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60

Qy         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120

Qy        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

Qy        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

Qy        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300

Qy        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360

Qy        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420

Qy        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600

Qy        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720

Qy        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780

Qy        781 ACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||
Db        781 ACATTTAGAAAAACGCAATTATATCCATAA 810
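
The base composition given on each SQ line can be checked against the declared sequence length. The following is a minimal sketch, assuming the SQ line layout seen in this listing ("Sequence N BP; count symbol; ..."); the function names are illustrative and are not part of the search program.

```python
import re


def parse_sq_line(line):
    """Return (declared_length, {symbol: count}) parsed from an 'SQ Sequence ...' line."""
    m = re.search(r"Sequence\s+(\d+)\s+BP;", line)
    if m is None:
        raise ValueError("not an SQ line")
    declared = int(m.group(1))
    # Every remaining "<count> <symbol>;" pair, e.g. "204 A;" or "0 Other;"
    pairs = re.findall(r"(\d+)\s+([A-Za-z]+);", line[m.end():])
    counts = {symbol: int(n) for n, symbol in pairs}
    return declared, counts


def composition_is_consistent(line):
    """True when the per-symbol counts add up to the declared length."""
    declared, counts = parse_sq_line(line)
    return sum(counts.values()) == declared


sq = "SQ Sequence 810 BP; 204 A; 183 C; 202 G; 221 T; 0 U; 0 Other;"
print(parse_sq_line(sq))
print(composition_is_consistent(sq))
```

For the line above, 204 + 183 + 202 + 221 adds up to the declared 810 BP.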



RESULT 2

AAZ52369 

ID AAZ52369 standard; cDNA; 810 BP. 
XX 

AC AAZ52369; 
XX 

DT 24-JUL-2000 (first entry) 
XX 

DE Human beta-amyloid peptide (BAP) binding protein, BBP1 encoding cDNA. 
XX 

KW Beta-amyloid peptide binding protein; BBP; BAP; tumour; suppressor; 



KW G-protein coupled receptor; GPCR; integral membrane protein; antigen; 

KW neuronal cell; nonhuman primate; NHP; G-protein signalling pathway; 

KW apoptosis; immunogen; therapeutic; treatment; prevention; diagnostic; ss. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..810

FT /*tag= a 

FT /product= "Human BBP1 protein" 

FT /note= "Member of G-protein coupled receptor superfamily"

XX 

PN WO200022125-A2. 
XX 

PD 20-APR-2000. 
XX 

PF 13-OCT-1999; 99WO-US021621.
XX

PR 13-OCT-1998; 98US-0104104P.
XX 

PA (AMHP ) AMERICAN HOME PROD CORP. 
XX 

PI Ozenberger BA, Kajkowski EM, Lo CF; 
XX 

DR WPI; 2000-317982/27. 

DR P-PSDB; AAY70759. 
XX 

PT Novel G-protein-coupled receptor-like proteins and polynucleotides useful 

PT for regulating apoptosis, comprises integral membrane protein traversing 

PT the membrane twice. 
XX 

PS Example 1; Page 60-61; 68pp; English. 
XX 

CC The present sequence is the cDNA encoding beta-amyloid peptide (BAP) 

CC binding protein-1 (BBP1). It is an integral membrane protein, that

CC traverse the membrane twice. It is related to G protein-coupled receptor

CC (GPCR) protein superfamily. It interacts with G-alpha proteins and

CC regulates the activity of G-protein signalling pathways. BBP genes are 

CC widely expressed in neuronal cells of nonhuman primate (NHP) brain and 

CC overexpressed in some tumours. It functions as a suppressor of apoptosis 

CC induction. BBP proteins are used as immunogens to raise antibodies, 

CC useful as therapeutics and as antigens in solid phase assays. They are 

CC also useful as reagents to identify molecules which effect the 

CC interaction of BBP and a cloned protein, that are useful in the treatment 

CC or prevention of diseases associated with apoptosis. The polynucleotides 

CC are useful for diagnostics. Note: In claim 5, the patent claims an amino 

CC acid sequence from figure 2. However, figure 2 does not contain any

CC sequence. It is inferred from the disclosure that the figure 2 sequence 

CC refers to BBP1 protein, encoded by this polynucleotide sequence

XX 

SQ Sequence 810 BP; 204 A; 183 C; 202 G; 221 T; 0 U; 0 Other; 

Query Match 100.0%; Score 810; DB 3; Length 810; 

Best Local Similarity 100.0%; Pred. No. 2e-233; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60

Qy         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120

Qy        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

Qy        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

Qy        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300

Qy        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360

Qy        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420

Qy        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600

Qy        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720

Qy        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780

Qy        781 ACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||
Db        781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 3 



AAD51940 

ID AAD51940 standard; cDNA; 810 BP. 
XX 

AC AAD51940; 
XX 

DT 02-MAY-2003 (first entry) 
XX 

DE Human BBP-1 cDNA. 
XX 

KW Human; beta-amyloid peptide-binding protein; BAP; Abeta; betaAP; BBP; 

KW Alzheimer's disease; AD; transgenic; transgenic animal; gene therapy; 

KW neuroprotective; nootropic; gene; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..810

FT /*tag= a 

FT /product= "Human BBP-1"

XX 

PN WO200290499-A2. 
XX 

PD 14-NOV-2002. 
XX 

PF 06-MAY-2002; 2002WO-US014223.
XX

PR 09-MAY-2001; 2001US-00852100.
XX 

PA (AMHP ) WYETH. 
XX 

PI Ozenberger BA, Bard JA, Kajkowski EM, Jacobsen JS, Walker SG;

PI Sofia HJ, Howland DS; 

XX 

DR WPI; 2003-120537/11. 

DR P-PSDB; AAE33877. 
XX 

PT New human beta-amyloid peptide-binding protein, useful for diagnosing 

PT and/or treating diseases associated with aberrant expression of beta- 

PT amyloid peptide, e.g. Alzheimer's disease. 
XX 

PS Claim 1; Page 82-84; 85pp; English. 
XX 

CC The present invention relates to novel human beta-amyloid peptide (BAP; 

CC Abeta, betaAP)-binding (BBP) proteins and polynucleotides encoding such

CC proteins. BBP sequences are useful to diagnose and/or treat diseases 

CC associated with aberrant expression of human BAP such as Alzheimer's 

CC disease (AD). They are used to generate transgenic animals. Sequences of

CC the invention are also used in gene therapy. The present sequence is 

CC human BBP-1 cDNA 

XX 

SQ Sequence 810 BP; 204 A; 183 C; 202 G; 221 T; 0 U; 0 Other; 

Query Match 100.0%; Score 810; DB 7; Length 810; 
Best Local Similarity 100.0%; Pred. No. 2e-233; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60

Qy         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120

Qy        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

Qy        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

Qy        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300

Qy        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360

Qy        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420

Qy        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600

Qy        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720

Qy        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780

Qy        781 ACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||
Db        781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 4 



AAD51979 

ID AAD51979 standard; DNA; 1246 BP. 
XX 

AC AAD51979; 
XX 

DT 02-MAY-2003 (first entry) 
XX 

DE Human BBP-1 genomic DNA. 
XX 

KW Human; beta-amyloid peptide-binding protein; BAP; Abeta; betaAP; BBP; 

KW Alzheimer's disease; AD; transgenic; transgenic animal; gene therapy;

KW neuroprotective; nootropic; ds.
XX 

OS Homo sapiens. 
XX 

PN WO200290499-A2. 
XX 

PD 14-NOV-2002. 
XX 

PF 06-MAY-2002; 2002WO-US014223.
XX

PR 09-MAY-2001; 2001US-00852100.
XX 

PA (AMHP ) WYETH. 
XX 

PI Ozenberger BA, Bard JA, Kajkowski EM, Jacobsen JS, Walker SG; 

PI Sofia HJ, Howland DS; 

XX 

DR WPI; 2003-120537/11. 
XX 

PT New human beta-amyloid peptide-binding protein, useful for diagnosing 

PT and/or treating diseases associated with aberrant expression of beta- 

PT amyloid peptide, e.g. Alzheimer's disease. 
XX 

PS Disclosure; Fig 11; 85pp; English. 
XX 

CC The present invention relates to novel human beta-amyloid peptide (BAP; 

CC Abeta, betaAP)-binding (BBP) proteins and polynucleotides encoding such

CC proteins. BBP sequences are useful to diagnose and/or treat diseases 

CC associated with aberrant expression of human BAP such as Alzheimer's 

CC disease (AD). They are used to generate transgenic animals. Sequences of 

CC the invention are also used in gene therapy. The present sequence is 

CC human BBP-1 genomic DNA 
XX 

SQ Sequence 1246 BP; 318 A; 255 C; 283 G; 390 T; 0 U; 0 Other; 

Query Match 100.0%; Score 810; DB 7; Length 1246;

Best Local Similarity 100.0%; Pred. No. 2.6e-233; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        118 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 177

Qy         61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        178 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 237

Qy        121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        238 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 297

Qy        181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        298 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 357

Qy        241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        358 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 417

Qy        301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        418 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 477

Qy        361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        478 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 537

Qy        421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        538 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 597

Qy        481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        598 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 657

Qy        541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        658 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 717

Qy        601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        718 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 777

Qy        661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        778 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 837

Qy        721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        838 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 897

Qy        781 ACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||
Db        898 ACATTTAGAAAAACGCAATTATATCCATAA 927
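
The summary figures reported for each hit can be cross-checked against one another. The sketch below assumes, as an inference from the values in this listing rather than from any documentation of the search program, that Query Match is the hit score as a percentage of the perfect score of 810, and that Best Local Similarity is the number of matches as a percentage of all aligned columns. The function names are illustrative.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def local_similarity_percent(matches, conservative, mismatches, indels):
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


PERFECT_SCORE = 810  # perfect score reported for the query in this listing

# RESULT 4: score 810 against a 1246 BP database entry
print(query_match_percent(810, PERFECT_SCORE))
# RESULT 5: score 602.8; 624 matches, 5 conservative, 3 mismatches, 2 indels
print(query_match_percent(602.8, PERFECT_SCORE))
print(local_similarity_percent(624, 5, 3, 2))
```

Under these assumptions the RESULT 4 hit still scores 100.0% even though the database entry is 1246 BP long, because the percentage is taken relative to the query, not to the database sequence.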



RESULT 5 
AAX97705 

ID AAX97705 standard; DNA; 970 BP. 
XX 

AC AAX97705; 
XX 

DT 13-SEP-1999 (first entry) 



XX
DE Extended human secreted protein coding sequence, SEQ ID NO. 270.
XX
KW Secreted protein; human; cytokine; cellular proliferation; cell movement;
KW cellular differentiation; immune system regulator; anti-inflammatory;
KW haematopoiesis regulator; tissue growth regulator; tumour inhibitor;
KW reproductive hormone regulator; chemotaxis; chemokinesis; gene therapy;
KW genetic disease; ss.
XX
OS Homo sapiens.
XX
PN WO9931236-A2.
XX
PD 24-JUN-1999.
XX
PF 17-DEC-1998; 98WO-IB002122.
XX
PR 17-DEC-1997; 97US-0069957P.
PR 09-FEB-1998; 98US-0074121P.
PR 13-APR-1998; 98US-0081563P.
PR 10-AUG-1998; 98US-0096116P.
XX
PA (GEST ) GENSET.
XX
PI Bougueleret L, Duclert A, Dumas Milne Edwards J;
XX
DR WPI; 1999-385906/32.
DR P-PSDB; AAY36021.
XX
PT New isolated human secreted proteins.
XX
PS Claim 1; Page 346-347; 516pp; English.
XX
CC This sequence represents an extended human secreted protein coding
CC sequence of the invention. The secreted proteins can be used in treating
CC or controlling a variety of human conditions. The secreted proteins may
CC act as cytokines or may affect cellular proliferation or differentiation
CC or may act as immune system regulators, haematopoiesis regulators, tissue
CC growth regulators, regulators of reproductive hormones or cell movement
CC or have chemotactic/chemokinetic, receptor/ligand, anti-inflammatory or
CC tumour inhibition activity. The DNAs can be used in forensic procedures
CC to identify individuals or in diagnostic procedures to identify
CC individuals having genetic diseases resulting from abnormal expression of
CC the genes corresponding to the extended cDNAs. They are also useful for
CC constructing a high resolution map of the human chromosomes. They can
CC also be used for gene therapy to control or treat genetic diseases
XX
SQ Sequence 970 BP; 267 A; 173 C; 199 G; 323 T; 0 U; 8 Other;

Query Match 74.4%; Score 602.8; DB 2; Length 970;
Best Local Similarity 98.4%; Pred. No. 6.7e-171;
Matches 624; Conservative 5; Mismatches 3; Indels 2; Gaps 2;



Qy        177 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGC 236
              |||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
Db          2 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGKCTGCTCCGGAGGCCGTGACGGC 61

Qy        237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         62 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 121

Qy        297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 356
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        122 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 181

Qy        357 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 416
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        182 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 241

Qy        417 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 476
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        242 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 301

Qy        477 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 536
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        302 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 361

Qy        537 TGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 596
              ||||||||||||||| |  ||||||||| |||||||||||||||||||||||||||||||
Db        362 TGGCTATTCCTACAATG-AGCAGTCGCA-TGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 419

Qy        597 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTG 656
              |||||||||||||||||||||||||||||||||||||||:|||:||||||||||||||:|
Db        420 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAABTTTYGCACTGTAGGGTTTKG 479

Qy        657 TGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGA 716
              |||||||||||||||||||||||||||:|||||||||||||||||||||||||||||| |
Db        480 TGGAATTGGGAGCCTAATTGATTTCATYCTTATTTCAATGCAGATTGTTGGACCTTCAAA 539

Qy        717 TGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAA 776
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        540 TGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAA 599

Qy        777 TGAAACATTTAGAAAAACGCAATTATATCCATAA 810
              ||||||||||||||||||||||||||||||||||
Db        600 TGAAACATTTAGAAAAACGCAATTATATCCATAA 633
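
The RESULT 5 alignment is the first in this listing to report conservative positions, and its database sequence contains the ambiguity symbols K, Y and B. The sketch below shows one way to tally alignment columns, assuming that a "conservative" column is one where the two symbols differ but are compatible under the standard IUPAC nucleotide codes. That interpretation is an assumption drawn from this alignment, not a documented rule of the search program, and the function names are illustrative.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def classify_column(q, d):
    """Classify one aligned column as match, conservative, mismatch or indel."""
    if q == "-" or d == "-":
        return "indel"
    # Treat U as T so RNA and DNA symbols compare equal.
    q = q.upper().replace("U", "T")
    d = d.upper().replace("U", "T")
    if q == d:
        return "match"
    if set(IUPAC[q]) & set(IUPAC[d]):
        return "conservative"  # assumed meaning: ambiguity code compatible with the base
    return "mismatch"


def tally(qy, db):
    """Count column classes over two equal-length aligned strings."""
    counts = {"match": 0, "conservative": 0, "mismatch": 0, "indel": 0}
    for q, d in zip(qy, db):
        counts[classify_column(q, d)] += 1
    return counts


# Fragments taken from the RESULT 5 alignment (query 627-646 and 547-566).
print(tally("TTTGTTAAAGTTTTGCACTG", "TTTGTTAAABTTTYGCACTG"))
print(tally("TACAAAGTGGCAGTCGCATT", "TACAATG-AGCAGTCGCA-T"))
```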



RESULT 6 
AAA77946 

ID AAA77946 standard; cDNA; 508 BP. 
XX 

AC AAA77946; 
XX 

DT 14-NOV-2000 (first entry) 
XX 

DE cDNA encoding human colon tumour polypeptide, SEQ ID NO: 233. 
XX 

KW Human colon tumour polypeptide; tumour antigen; cancer; vaccine; 

KW immunotherapy; diagnosis; progression; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200037643-A2. 



XX
PD 29-JUN-2000.
XX
PF 23-DEC-1999; 99WO-US030909.
XX
PR 23-DEC-1998; 98US-00221298.
PR 02-JUL-1999; 99US-00347496.
PR 22-SEP-1999; 99US-00401064.
PR 19-NOV-1999; 99US-00444242.
PR 02-DEC-1999; 99US-00454150.
XX

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk J; 

PI Wang T, Yuqiu J; 

XX 

DR WPI; 2000-442671/38. 
XX 

PT New colon tumor polypeptides used to inhibit the development of cancer, 

PT especially colon cancer, and for diagnosing and monitoring the 

PT progression of the cancer. 
XX 

PS Claim 1; Page 158-159; 229pp; English. 
XX 

CC Sequences AAA77722-A78199 represent 478 cDNAs encoding proteins or 

CC portions of proteins which are associated with human colon tumours. The 

CC invention also specifically discloses 8 human colon tumour proteins 

CC (AAB11897-B11904). The nucleic acids, the polypeptides they encode, and

CC antigen presenting cells (APCs, preferably dendritic cells) expressing 

CC such polypeptides may be used in vaccines that target tumour cells, 

CC especially colon tumour cells, thereby inhibiting the development of 

CC cancer. T-cells specific for the polypeptide expressed by the APC are 

CC used to remove tumour cells from biological samples, especially blood or 

CC fractions thereof. The sample or the isolated T-cells specific for the 

CC polypeptide can then be used to inhibit cancer development. CD4+ and/or

CC CD8+ T-cells from a patient may be incubated with a polypeptide or 

CC nucleic acid of the invention, or an APC expressing such a polypeptide, 

CC to cause the proliferation of specific T-cells. The T-cells can be cloned 

CC and then administered back to the patient to inhibit cancer development. 

CC Nucleic acids encoding the polypeptides and antibodies against the 

CC polypeptides may be used to determine the expression level of a tumour 

CC protein of the invention, and therefore to determine whether cancer cells 

CC are present. Such diagnostic methods may also be used to monitor the 

CC progression of a cancer by repeating the processes at time intervals, and 

CC comparing the current result to previous results. The present sequence 

CC represents a cDNA encoding a human colon tumour polypeptide 

XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 

Query Match 61.6%; Score 499; DB 3; Length 508; 

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 7 
AAA77958 

ID AAA77958 standard; cDNA; 508 BP.
XX 

AC AAA77958; 
XX 

DT 14-NOV-2000 (first entry) 
XX 

DE cDNA encoding human colon tumour polypeptide, SEQ ID NO: 245. 
XX 

KW Human colon tumour polypeptide; tumour antigen; cancer; vaccine; 

KW immunotherapy; diagnosis; progression; ss. 

XX 

OS Homo sapiens.
XX 

PN WO200037643-A2. 
XX 

PD 29-JUN-2000. 
XX 

PF 23-DEC-1999; 99WO-US030909 . 
XX 

PR 23-DEC-1998; 98US-00221298 . 

PR 02-JUL-1999; 99US-00347496 . 

PR 22-SEP-1999; 99US-00401064.



PR 19-NOV-1999; 99US-00444242 . 

PR 02-DEC-1999; 99US-00454150 . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk J; 

PI Wang T, Yuqiu J; 

XX 

DR WPI; 2000-442671/38. 
XX 

PT New colon tumor polypeptides used to inhibit the development of cancer, 

PT especially colon cancer, and for diagnosing and monitoring the 

PT progression of the cancer. 
XX 

PS Claim 1; Page 162; 229pp; English.
XX 

CC Sequences AAA77722-A78199 represent 478 cDNAs encoding proteins or 

CC portions of proteins which are associated with human colon tumours. The 

CC invention also specifically discloses 8 human colon tumour proteins 

CC (AAB11897-B11904). The nucleic acids, the polypeptides they encode, and

CC antigen presenting cells (APCs, preferably dendritic cells) expressing 

CC such polypeptides may be used in vaccines that target tumour cells, 

CC especially colon tumour cells, thereby inhibiting the development of 

CC cancer. T-cells specific for the polypeptide expressed by the APC are 

CC used to remove tumour cells from biological samples, especially blood or 

CC fractions thereof. The sample or the isolated T-cells specific for the 

CC polypeptide can then be used to inhibit cancer development. CD4+ and/or 

CC CD8+ T-cells from a patient may be incubated with a polypeptide or 

CC nucleic acid of the invention, or an APC expressing such a polypeptide, 

CC to cause the proliferation of specific T-cells. The T-cells can be cloned 

CC and then administered back to the patient to inhibit cancer development. 

CC Nucleic acids encoding the polypeptides and antibodies against the 

CC polypeptides may be used to determine the expression level of a tumour 

CC protein of the invention, and therefore to determine whether cancer cells 

CC are present. Such diagnostic methods may also be used to monitor the 

CC progression of a cancer by repeating the processes at time intervals, and 

CC comparing the current result to previous results. The present sequence 

CC represents a cDNA encoding a human colon tumour polypeptide 

XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 

Query Match 61.6%; Score 499; DB 3; Length 508; 

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 8 
AAI28684 

ID AAI28684 standard; cDNA; 508 BP. 
XX 

AC AAI28684; 
XX 

DT 12-OCT-2001 (first entry) 
XX 

DE Colon tumour related determined cDNA sequence for clone 25275. 
XX 

KW Human; immunotherapy; diagnosis; colon cancer; colon tumour; immunogenic; 

KW gene therapy; vaccine; colonic cancer; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200149716-A2. 
XX 

PD 12-JUL-2001. 
XX 

PF 29-DEC-2000; 2000WO-US035596 . 
XX 

PR 30-DEC-1999; 99US-00476296.

PR 10-JAN-2000; 2000US-00480321 . 

PR 15-FEB-2000; 2000US-00504629.

PR 06-MAR-2000; 2000US-00519444 . 

PR 19-MAY-2000; 2000US-00575251 . 

PR 29-JUN-2000; 2000US-00609448 . 

PR 28-AUG-2000; 2000US-00649811 . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk JA; 



PI King GE, Wang T, Jiang Y; 
XX 

DR WPI; 2001-441847/47. 
XX 

PT Colon tumor associated proteins and nucleic acids useful for the 

PT prevention, diagnosis and treatment of colonic cancer. 

XX 

PS Claim 2; Page 198; 472pp; English. 
XX 

CC The present invention describes colon tumour associated proteins (I) and 

CC the polynucleotides (II) that encode them. (I) have cytostatic activity. 

CC (I) and (II) can be used in gene therapy and vaccine production. (I) and 

CC (II) may be used in the prevention, diagnosis and treatment of diseases 

CC associated with inappropriate colon tumour associated protein (TCAP) 

CC expression, such as colonic cancer. For example, (I) and (II) may be used 

CC to treat disorders associated with decreased expression by rectifying 

CC mutations or deletions in a patient's genome that affect the activity of 

CC TCAPs by expressing inactive proteins or to supplement the patient's own

CC production of them. Additionally, (II) may be used to produce the TCAP 

CC proteins, by inserting the nucleic acids into a host cell and culturing the

CC cell to express the protein. (II) and its complementary sequences may 

CC also be used as DNA probes in diagnostic polymerase chain reaction (PCR) 

CC and hybridisation assays to detect and quantitate the presence of similar 

CC nucleic acids in samples, and therefore which patients may be in need of 

CC restorative therapy. (I) may also be used as antigens in the production 

CC of antibodies against TCAPs and in assays to identify modulators of TCAP 

CC expression and activity. Anti-(I) antibodies and antagonists may also be 

CC used to down regulate TCAP expression and activity. The anti-(I) 

CC antibodies may also be used as diagnostic agents for detecting the 

CC presence of TCAPs in samples (e.g. by enzyme linked immunosorbant assay 

CC (ELISA)). AAI28460 to AAI29512 and AAM24494 to AAM24523 represent

CC nucleotide and amino acid sequences given in the exemplification of the 

CC present invention 
XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 



Query Match 61.6%; Score 499; DB 4; Length 508; 

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 9 
AAI28696 

ID AAI28696 standard; cDNA; 508 BP. 
XX 

AC AAI28696; 
XX 

DT 12-OCT-2001 (first entry) 
XX 

DE Colon tumour related determined cDNA sequence for clone 25288. 
XX 

KW Human; immunotherapy; diagnosis; colon cancer; colon tumour; immunogenic; 

KW gene therapy; vaccine; colonic cancer; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200149716-A2. 
XX 

PD 12-JUL-2001. 
XX 

PF 29-DEC-2000; 2000WO-US035596 . 
XX 

PR 30-DEC-1999; 99US-00476296 . 

PR 10-JAN-2000; 2000US-00480321 . 

PR 15-FEB-2000; 2000US-00504629 . 

PR 06-MAR-2000; 2000US-00519444 . 

PR 19-MAY-2000; 2000US-00575251 . 

PR 29-JUN-2000; 2000US-00609448.

PR 28-AUG-2000; 2000US-00649811 . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk JA; 

PI King GE, Wang T, Jiang Y; 

XX 

DR WPI; 2001-441847/47. 
XX 

PT Colon tumor associated proteins and nucleic acids useful for the 



PT prevention, diagnosis and treatment of colonic cancer. 
XX 

PS Claim 2; Page 201; 472pp; English. 
XX 

CC The present invention describes colon tumour associated proteins (I) and 

CC the polynucleotides (II) that encode them. (I) have cytostatic activity. 

CC (I) and (II) can be used in gene therapy and vaccine production. (I) and 

CC (II) may be used in the prevention, diagnosis and treatment of diseases 

CC associated with inappropriate colon tumour associated protein (TCAP) 

CC expression, such as colonic cancer. For example, (I) and (II) may be used 

CC to treat disorders associated with decreased expression by rectifying 

CC mutations or deletions in a patient's genome that affect the activity of

CC TCAPs by expressing inactive proteins or to supplement the patient's own

CC production of them. Additionally, (II) may be used to produce the TCAP 

CC proteins, by inserting the nucleic acids into a host cell and culturing the

CC cell to express the protein. (II) and its complementary sequences may 

CC also be used as DNA probes in diagnostic polymerase chain reaction (PCR) 

CC and hybridisation assays to detect and quantitate the presence of similar 

CC nucleic acids in samples, and therefore which patients may be in need of 

CC restorative therapy. (I) may also be used as antigens in the production 

CC of antibodies against TCAPs and in assays to identify modulators of TCAP 

CC expression and activity. Anti-(I) antibodies and antagonists may also be 

CC used to down regulate TCAP expression and activity. The anti-(I) 

CC antibodies may also be used as diagnostic agents for detecting the 

CC presence of TCAPs in samples (e.g. by enzyme linked immunosorbant assay 

CC (ELISA)). AAI28460 to AAI29512 and AAM24494 to AAM24523 represent

CC nucleotide and amino acid sequences given in the exemplification of the 

CC present invention 
XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 

Query Match 61.6%; Score 499; DB 4; Length 508; 

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 10 
ABZ32882 

ID ABZ32882 standard; cDNA; 508 BP. 
XX 

AC ABZ32882; 
XX 

DT 30-JAN-2003 (first entry) 
XX 

DE Human colon tumour cDNA clone 25288 SEQ ID NO: 245. 
XX 

KW Human; colon cancer; colon tumour; immunotherapy; diagnosis; cancer; 

KW tumour; immune response; immuno stimulant; cytostatic; vaccine; gene; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200283070-A2. 
XX 

PD 24-OCT-2002. 
XX 

PF 09-APR-2002; 2002WO-US011475 . 
XX 

PR 10-APR-2001; 2001US-00833263 . 

PR 03-AUG-2001; 2001US-00922217 . 

PR 19-DEC-2001; 2001US-00025380 . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk JA; 

PI Wang T, Jiang Y, Smith CL, King GE, Wang A, Clapper JD, Skeiky YAW; 

PI Fanger GR, Vedvick TS, Carter D;

XX

DR WPI; 2003-067548/06.

XX 

PT New polynucleotide, useful for the preparation of a composition for 

PT stimulating an immune response against, or treating, cancer. 

XX 

PS Example 1; Page 204; 537pp; English. 
XX 

CC The present invention describes compounds (I) for the immunotherapy and 

CC diagnosis of colon cancer. Also described: (1) a method for detecting the 

CC presence of cancer in a patient; (2) a method for stimulating and/or 

CC expanding T cells specific for a tumour protein; (3) an isolated T cell 



CC population comprising T cells prepared by the method of (2); (4) a method 

CC for stimulating an immune response in a patient; (5) a method for 

CC treating cancer in a patient; and (6) a method for inhibiting the 

CC development of cancer in a patient. (I) have immunostimulant and

CC cytostatic activities and can be used in vaccines. ABZ32646 to ABZ33725 

CC and ABP55343 to ABP55391 represent human colon cancer/tumour related 

CC sequences used in the exemplification of the present invention 

XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 

Query Match 61.6%; Score 499; DB 7; Length 508; 

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 11 
ABZ32870 

ID ABZ32870 standard; cDNA; 508 BP. 
XX 

AC ABZ32870; 
XX 





DT 30-JAN-2003 (first entry) 
XX 

DE Human colon tumour cDNA clone 25275 SEQ ID NO: 233. 
XX 

KW Human; colon cancer; colon tumour; immunotherapy; diagnosis; cancer; 

KW tumour; immune response; immuno stimulant; cytostatic; vaccine; gene; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200283070-A2. 
XX 

PD 24-OCT-2002. 
XX 

PF 09-APR-2002; 2002WO-US011475 . 
XX 

PR 10-APR-2001; 2001US-00833263 . 

PR 03-AUG-2001; 2001US-00922217 . 

PR 19-DEC-2001; 2001US-00025380 . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Xu J, Lodes MJ, Secrist H, Benson DR, Meagher MJ, Stolk JA; 

PI Wang T, Jiang Y, Smith CL, King GE, Wang A, Clapper JD, Skeiky YAW; 

PI Fanger GR, Vedvick TS, Carter D; 

XX 

DR WPI; 2003-067548/06. 
XX 

PT New polynucleotide, useful for the preparation of a composition for 

PT stimulating an immune response against, or treating, cancer. 

XX 

PS Example 1; Page 201; 537pp; English. 
XX 

CC The present invention describes compounds (I) for the immunotherapy and 

CC diagnosis of colon cancer. Also described: (1) a method for detecting the 

CC presence of cancer in a patient; (2) a method for stimulating and/or 

CC expanding T cells specific for a tumour protein; (3) an isolated T cell 

CC population comprising T cells prepared by the method of (2); (4) a method 

CC for stimulating an immune response in a patient; (5) a method for 

CC treating cancer in a patient; and (6) a method for inhibiting the 

CC development of cancer in a patient. (I) have immunostimulant and 

CC cytostatic activities and can be used in vaccines. ABZ32646 to ABZ33725 

CC and ABP55343 to ABP55391 represent human colon cancer/tumour related 

CC sequences used in the exemplification of the present invention 

XX 

SQ Sequence 508 BP; 153 A; 89 C; 103 G; 163 T; 0 U; 0 Other; 

Query Match 61.6%; Score 499; DB 7; Length 508;

Best Local Similarity 100.0%; Pred. No. 9.4e-140; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 12 
ABK52558 

ID ABK52558 standard; cDNA; 1095 BP. 
XX 

AC ABK52558; 
XX 

DT 13-AUG-2002 (first entry) 
XX 

DE cDNA encoding RNA polymerase II subunit 11. 
XX 

KW RNA polymerase II subunit 11; ss; gene; cancer; HIV; infection; 

KW human immunodeficiency virus . 

XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers 



"FT CDS 1-2^— r-3-1-4- 

FT /*tag= a 

FT /product= "RNA polymerase II subunit 11" 

XX 

PN CN1331300-A. 
XX 

PD 16-JAN-2002. 
XX 

PF 30-JUN-2000; 2000CN-00116963 . 
XX 

PR 30-JUN-2000; 2000CN-00116963 .
XX

PA (BODE-) BODE GENE DEV CO LTD SHANGHAI.
XX

PI Mao Y, Xie Y;
XX

DR WPI; 2002-340664/38.

DR P-PSDB; AAU97631.
XX

PT Polypeptide-RNA polymerase II subunit 11 and polynucleotide for coding

PT it.

XX 

PS Claim 6; Page 28-29; 32pp; Chinese. 
XX 

CC This invention relates to the DNA and protein sequences of a novel 

CC polypeptide-RNA polymerase II subunit 11 protein. The invention also 

CC comprises a process for preparing the polypeptide of the invention by DNA 

CC recombination, the application of the polypeptide in treating diseases 

CC such as cancer, human immunodeficiency virus (HIV) infection, etc, the 

CC antagonist of the polypeptide and its medical action, and the application 

CC of the said polynucleotide are disclosed. The present sequence represents 

CC the cDNA sequence encoding the RNA polymerase II subunit 11 protein of 

CC the invention 

XX 

SQ Sequence 1095 BP; 268 A; 230 C; 244 G; 353 T; 0 U; 0 Other; 

Query Match 54.6%; Score 442.2; DB 6; Length 1095; 

Best Local Similarity 81.1%; Pred. No. 1.9e-122; 

Matches 596; Conservative 0; Mismatches 38; Indels 101; Gaps 3; 

Qy        177 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGC 236
              |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          2 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGATGCCGTGACGGC 61

Qy        237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         62 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 121

Qy        297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATAT-A 355
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        122 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATCC 181

Qy        356 TTTGTAAAGATCCAAAAATAAATGACGCTA--------------CGCAAGAACCAGTTAA 401

Db        182

Qy        402 CTGTACAAACTACACAGCTCA--------------------------------------- 422
              | ||   |    | |||||||
Db        242 CAGTGGCACGATCTCAGCTCACTGCAGCCTCCGCCTCCCGGGTTCAGTCAATTCTCCTGC 301

Qy        423 -----------------------------------------------TGTTTCCTGTTTT 435
                                                              ||||||||||||
Db        302 CTCAGCCTCCTGAGTAGCTGGGACTACAGGCATGCGCCACCACACCCGGTTTCCTGTTTT 361

Qy        436 CCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAAC 495
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        362 CCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAAC 421

Qy        496 GAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTG 555
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        422 GAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTG 481

Qy        556 GCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATAC 615
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        482 GCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATAC 541

Qy        616 CCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATT 675
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        542 CCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATT 601

Qy        676 GATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATTATA 735
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        602 GATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATTATA 661

Qy        736 GATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAAACG 795
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        662 GATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAAACG 721

Qy        796 CAATTATATCCATAA 810
              |||||||||||||||
Db        722 CAATTATATCCATAA 736



RESULT 13 
AAX41191 

ID AAX41191 standard; cDNA; 440 BP. 
XX 

AC AAX41191; 
XX 

DT 17-JUN-1999 (first entry) 
XX 

DE Human secreted protein 5' EST SEQ ID NO: 135.
XX 

KW Human; secreted protein; EST; expressed sequence tag; diagnosis; 

KW forensic; gene therapy; chromosome mapping; signal peptide; 

KW upstream regulatory sequence; cytokine activity; cell proliferation; 

KW differentiation; haematopoiesis regulation; tissue growth regulation; 

KW reproductive hormone regulation; chemotactic; chemokinetic; haemostatic; 

KW thrombolytic; anti-inflammatory; tumour inhibition; ds.

XX 

OS Homo sapiens. 
XX 

PN WO9906548-A2.
XX

PD 11-FEB-1999.

XX

PF 31-JUL-1998; 98WO-IB001222.
XX

PR 01-AUG-1997; 97US-00905135.
XX

PA (GEST ) GENSET.
XX 

PI Dumas Milne Edwards J, Duclert A, Lacroix B; 
XX 

DR WPI; 1999-153778/13. 



DR P-PSDB; AAY12358. 
XX 

PT New nucleic acids encoding human secreted proteins - obtained from cDNA 

PT libraries prepared from e.g. liver, ovary, brain, prostate, kidney, lung,

PT umbilical cord, placenta and colon tissue. 
XX 

PS Claim 1; Page 315; 824pp; English. 
XX 

CC AAX41094 to AAX41347 represent 5' expressed sequence tags (ESTs) for

CC human secreted proteins, and encode the proteins given in AAY12261 to 

CC AAY12514, respectively. The proteins given represent the signal peptide 

CC and an N-terminal fragment of a secreted protein. The nucleic acid 

CC sequences can be used for producing secreted human gene products. They 

CC can also be used to develop products for diagnosis and therapy. The 

CC proteins obtained may have cytokine activity, cell 

CC proliferation/differentiation activity, haematopoiesis regulating 

CC activity, tissue growth regulating activity, reproductive hormone 

CC regulating activity, chemotactic/ chemokinetic activity, haemostatic and 

CC thrombolytic activity, receptor/ ligand activity, anti-inflammatory 

CC activity, tumour inhibition activity or other activities. The products 

CC can be used in forensic, gene therapy and chromosome mapping procedures. 

CC The sequences can also be used for obtaining corresponding promoter 

CC sequences. The nucleic acids encoding the signal peptide can be used for 

CC directing extracellular secretion of a polypeptide or the insertion of a 

CC polypeptide into a membrane, or importing a polypeptide into a cell 

XX 

SQ Sequence 440 BP; 107 A; 103 C; 114 G; 114 T; 0 U; 2 Other; 

Query Match 53.9%; Score 436.8; DB 2; Length 440; 

Best Local Similarity 99.5%; Pred. No. 4.9e-121; 

Matches 436; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy        166 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAG 225
              |||||||||||||||||||||||||||||||||||||::|||||||||||||||||||||
Db          3 GAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCSDTCTGGTCCGTCTGCTCCGGAG 62

Qy        226 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 285
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         63 GCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGG 122

Qy        286 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 345
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        123 GGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTG 182

Qy        346 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 405
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        183 GGACAATATATTTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGT 242

Qy        406 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 465
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        243 ACAAACTACACAGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCC 302

Qy        466 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 525
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        303 AGTGGCAATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGC 362

Qy        526 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 585
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        363 CGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGG 422

Qy        586 TTGGGAGCAGATCGATTT 603
              ||||||||||||||||||
Db        423 TTGGGAGCAGATCGATTT 440



RESULT 14 
AAX41259 

ID AAX41259 standard; cDNA; 455 BP. 
XX 

AC AAX41259; 
XX 

DT 17-JUN-1999 (first entry) 
XX 

DE Human secreted protein 5' EST SEQ ID NO: 203.
XX 

KW Human; secreted protein; EST; expressed sequence tag; diagnosis; 

KW forensic; gene therapy; chromosome mapping; signal peptide; 

KW upstream regulatory sequence; cytokine activity; cell proliferation; 

KW differentiation; haematopoiesis regulation; tissue growth regulation; 

KW reproductive hormone regulation; chemotactic; chemokinetic; haemostatic; 

KW thrombolytic; anti-inflammatory; tumour inhibition; ds.

XX

OS Homo sapiens.
XX

PN WO9906548-A2.
XX

PD 11-FEB-1999.
XX

PF 31-JUL-1998; 98WO-IB001222.
XX

PR 01-AUG-1997; 97US-00905135.
XX

PA (GEST ) GENSET.
XX 

PI Dumas Milne Edwards J, Duclert A, Lacroix B; 
XX 

DR WPI; 1999-153778/13. 

DR P-PSDB; AAY12426. 
XX 

PT New nucleic acids encoding human secreted proteins - obtained from cDNA 

PT libraries prepared from e.g. liver, ovary, brain, prostate, kidney, lung,

PT umbilical cord, placenta and colon tissue. 

XX

PS Claim 1; Page 456; 824pp; English.

XX 

CC AAX41094 to AAX41347 represent 5' expressed sequence tags (ESTs) for

CC human secreted proteins, and encode the proteins given in AAY12261 to 

CC AAY12514, respectively. The proteins given represent the signal peptide 

CC and an N-terminal fragment of a secreted protein. The nucleic acid 

CC sequences can be used for producing secreted human gene products. They 

CC can also be used to develop products for diagnosis and therapy. The 

CC proteins obtained may have cytokine activity, cell 

CC proliferation/differentiation activity, haematopoiesis regulating 

CC activity, tissue growth regulating activity, reproductive hormone 



CC regulating activity, chemotactic/ chemokinetic activity, haemostatic and

CC thrombolytic activity, receptor/ ligand activity, anti-inflammatory 

CC activity, tumour inhibition activity or other activities. The products 

CC can be used in forensic, gene therapy and chromosome mapping procedures. 

CC The sequences can also be used for obtaining corresponding promoter 

CC sequences. The nucleic acids encoding the signal peptide can be used for 

CC directing extracellular secretion of a polypeptide or the insertion of a 

CC polypeptide into a membrane, or importing a polypeptide into a cell 

XX 

SQ Sequence 455 BP; 102 A; 107 C; 115 G; 122 T; 0 U; 9 Other; 

Query Match 52.5%; Score 425.2; DB 2; Length 455; 

Best Local Similarity 96.3%; Pred. No. 1.6e-117; 

Matches 439; Conservative 3; Mismatches 12; Indels 2; Gaps 1; 

Qy        177 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGC 236
              |||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
Db          2 GGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGKCTGCTCCGGAGGCCGTGACGGC 61

Qy        237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         62 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 121

Qy        297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 356
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        122 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 181

Qy        357 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 416
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        182 TTGTAAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACAC 241

Qy        417 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGA 476
              |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db        242 AGCTCATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATNCCAGTGGCAATGA 301

Qy        477 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 536
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        302 AACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAA 361

Qy        537 TGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 596
              |||||||||||||    |:         :|||||||||||||||||||||||||||||||
Db        362 TGGCTATTCCTAC--NNTKAGCAGTNNNWTGTCTCTTTTTCTTGGATGGTTGGGAGCAGA 419

Qy        597 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTT 632
              ||||||||||||||||||||||||||||||||||||
Db        420 TCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTT 455



RESULT 15 
AAC04131 

ID AAC04131 standard; cDNA; 487 BP. 
XX 

AC AAC04131; 
XX 

DT 06-OCT-2000 (first entry) 
XX 

DE Human secreted protein 5' EST, SEQ ID NO: 8206.
XX

KW Human; 5' EST; expressed sequence tag; secreted protein; cDNA isolation;

KW gene therapy; chromosome mapping; ss.
XX

OS Homo sapiens.
XX

PN EP1033401-A2.
XX

PD 06-SEP-2000.
XX

PF 21-FEB-2000; 2000EP-00200610.
XX

PR 26-FEB-1999; 99US-0122487P.
XX

PA (GEST ) GENSET.
XX

PI Dumas Milne Edwards J, Duclert A, Giordano J;
XX

DR WPI; 2000-500381/45.
XX

PT New nucleic acid that is a 5' expressed sequence tag (5' EST) for

PT obtaining cDNAs and genomic DNAs that correspond to 5' ESTs and for

PT diagnostic, forensic, gene therapy and chromosome mapping procedures.
XX

PS Claim 1; SEQ ID NO 8206; 71pp + Sequence Listing; English. 
XX 

CC The present sequence is one of a large number of 5' ESTs derived from 

CC mRNAs encoding secreted proteins. No ORF has yet been conclusively 

CC identified within the present sequence. The 5' ESTs were prepared from

CC total human RNAs or polyA+ RNAs derived from 30 different tissues. EST 

CC sequences usually correspond mainly to the 3' untranslated region (UTR) 

CC of the mRNA because they are often obtained from oligo-dT primed cDNA 

CC libraries. Such ESTs are not well suited for isolating cDNA sequences 

CC derived from the 5' ends of mRNAs and even in those cases where longer 

CC cDNA sequences have been obtained, the full 5' UTR is rarely included. 5' 

CC ESTs are derived from mRNAs with intact 5' ends and can therefore be used

CC to obtain full length cDNAs and genomic DNAs. 5' ESTs are also used in 

CC diagnostic, forensic, gene therapy and chromosome mapping procedures. 

CC They are used to obtain upstream regulatory sequences and to design 

CC expression and secretion vectors 
XX 

SQ Sequence 487 BP; 115 A; 118 C; 125 G; 120 T; 0 U; 9 Other; 

Query Match 50.8%; Score 411.4; DB 3; Length 487; 
Best Local Similarity 91.4%; Pred. No. 2.4e-113; 

Matches 445; Conservative 9; Mismatches 10; Indels 23; Gaps 1; 

Qy        164 GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGG 223
              ||||||||||||||||||||||||||||||||||:||| |||||||:||| :||| ||||
Db          1 GCGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCMTGGACGTCTGGWCCGAMTGCACCGG 60

Qy        224 AGGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCT 283
              | |||||||||||||||:||||||||||||||||||||||:|||:|||||||||||||||
Db         61 AAGCCGTGACGGCCAGAMTCGTTGGTGTCCTGTGGTTCGTMTCARTCACTACAGGACCCT 120

Qy        284 GGGGGGCTGTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 343
              |||||||||||||||||||||||||||||:||||||||||||||||||||||||||||||
Db        121 GGGGGGCTGTTGCCACCTCCGCCGGGGGCRAGGAGTCGCTTAAGTGCGAGGACCTCAAAG 180

Qy        344 TGGGACAATATATTTGTAAAGATCCAAAAATAAATGA 380

Db        181 TGRRACAATATCCTCTGTGGAGAACACCCCCCCATGGAGGCGAGATCCAAAAATAAATGA 240

Qy        381 CGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGC 440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTGTTTTCCAGC 300

Qy        441 ACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGT 500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGGGAACGAAGT 360

Qy        501 TGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGT 560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 TGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGT 420

Qy        561 CGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGC 620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGC 480

Qy        621 TTTGGGT 627
              |||||||
Db        481 TTTGGGT 487



Search completed: March 4, 2004, 07:39:01 
Job time : 390 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         March 4, 2004, 06:47:39 ; Search time 91 Seconds
                (without alignments)
                4939.673 Million cell updates/sec

Title:          US-09-852-100B-1
Perfect score:  810
Sequence:       1 atgcatattttaaaagggtc aaacgcaattatatccataa 810

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       682709 seqs, 277475446 residues

Total number of hits satisfying chosen parameters:      1365418

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
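
The throughput figure in the run header can be checked by hand. The sketch below is not part of the GenCore output; it assumes that one "cell update" is one query-residue by database-residue comparison and that both strands of every database sequence are searched (the hit total, 1365418, is exactly twice the 682709 sequences searched, which is what suggests the factor of two). The function name is illustrative only.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second, in millions.

    strands=2 is an assumption: the search appears to score each
    database sequence in both orientations.
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Values taken from the run header above: an 810-residue query,
# 277475446 database residues, 91 seconds of search time.
rate = million_cell_updates_per_sec(810, 277_475_446, 91)
print(f"{rate:.3f} Million cell updates/sec")
```

Under these assumptions the result agrees with the reported 4939.673 Million cell updates/sec.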

Database : Issued_Patents_NA:*

1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*

2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*

3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*

4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*

5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*

6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
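
The per-result statistics in this report are related by simple ratios. The sketch below is not GenCore code; it assumes Query Match is the hit score as a percentage of the perfect score (810 for this query) and that Best Local Similarity is the number of matching columns as a percentage of all alignment columns (matches, conservative substitutions, mismatches and indels). Both assumptions reproduce the figures printed for the results in this listing. The function names are illustrative.

```python
def query_match(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches, conservative, mismatches, indels):
    """Matching columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Statistics printed for AAX41259 (Score 425.2; Matches 439;
# Conservative 3; Mismatches 12; Indels 2) with perfect score 810.
print(f"Query Match           {query_match(425.2, 810):.1f}%")
print(f"Best Local Similarity {best_local_similarity(439, 3, 12, 2):.1f}%")
```

The same two ratios can be applied to any result block to check that a score line and an alignment summary belong together.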

SUMMARIES 



Result           Query
   No.   Score   Match   Length  DB  ID                          Description

     1     499    61.6      508   4  US-09-401-064-233           Sequence 233, App
     2     499    61.6      508   4  US-09-401-064-245           Sequence 245, App
     3    49.8     6.1     1455   3  US-09-276-531-33            Sequence 33, Appl
     4    40.4     5.0     1119   4  US-09-489-039A-6022         Sequence 6022, Ap
     5    38.6     4.8   392000   4  US-10-027-983-11            Sequence 11, Appl
     6    36.8     4.5  4403765   3  US-09-103-840A-2            Sequence 2, Appli
     7    36.4     4.5     1462   1  US-08-552-142A-16           Sequence 16, Appl
     8    36.4     4.5     1951   1  US-08-910-973-16            Sequence 16, Appl
     9    36.4     4.5     1951   4  US-09-499-227-16            Sequence 16, Appl
    10    36.2     4.5     8093   4  US-10-204-708-32            Sequence 32, Appl
    11    35.4     4.4      450   4  US-09-252-991A-12127        Sequence 12127, A
c   12    35.4     4.4     1404   4  US-09-252-991A-12291        Sequence 12291, A
    13    35.4     4.4     9347   4  US-10-204-708-[illegible]   Sequence [illegible], Appl
    14    35.4     4.4   580016   4  [illegible]                 Sequence 1, Appli
c   15    35.2     4.3  4411529   6  US-09-103-840A-1            Sequence 1, Appli
c   16      35     4.3     1494   4  US-09-252-991A-7049         Sequence 7049, Ap
    17      35     4.3     4236   4  US-09-252-991A-[illegible]  Sequence [illegible], Ap
    18      35     4.3     7304   4  US-10-204-708-43            Sequence 43, Appl
c   19      35     4.3    10023   4  US-09-252-991A-6997         Sequence 6997, Ap
    20      35     4.3  1830121   4  US-09-557-884-1             Sequence 1, Appli
    21      35     4.3  1830121   4  US-09-643-990A-1            Sequence 1, Appli
    22    34.8     4.3      832   4  US-09-621-976-2813          Sequence 2813, Ap
c   23    34.6     4.3     4673   1  US-07-638-431-1             Sequence 1, Appli
c   24    34.6     4.3     4673   5  PCT-US92-00018-1            Sequence 1, Appli
    25      34     4.2     5152   4  US-10-204-708-47            Sequence 47, Appl
    26      34     4.2    11131   4  US-10-204-708-27            Sequence 27, Appl
    27    33.6     4.1    11049   4  US-10-204-708-23            Sequence 23, Appl
c   28    33.4     4.1      549   4  US-09-252-991A-14907        Sequence 14907, A
    29    33.4     4.1     1125   4  US-09-252-991A-14723        Sequence 14723, A
    30    33.4     4.1     1636   6  5447867-2                   Patent No. 5447867
c   31    33.4     4.1     1854   4  US-09-252-991A-15029        Sequence 15029, A
    32    33.4     4.1     9179   4  US-08-956-171E-100          Sequence 100, App
c   33    33.4     4.1   319608   4  US-09-539-333D-1            Sequence 1, Appli
c   34    33.4     4.1   319608   4  US-09-679-409-1             Sequence 1, Appli
c   35      33     4.1      364   4  US-09-621-976-17202         Sequence 17202, A
c   36      33     4.1     3255   4  US-09-601-198-108           Sequence 108, App
    37      33     4.1     4029   4  US-09-620-312D-201          Sequence 201, App
    38      33     4.1     5152   4  US-10-204-708-48            Sequence 48, Appl
c   39      33     4.1    55298   4  US-09-491-356C-1            Sequence 1, Appli
c   40    32.8     4.0      988   1  US-08-243-545-5             Sequence 5, Appli
c   41    32.8     4.0      988   2  US-08-993-962-5             Sequence 5, Appli
c   42    32.8     4.0      988   3  US-09-160-841-5             Sequence 5, Appli
c   43    32.8     4.0      988   3  US-09-109-100-2             Sequence 2, Appli
c   44    32.8     4.0      988   4  US-08-669-692-5             Sequence 5, Appli
c   45    32.8     4.0      988   4  US-08-444-626-5             Sequence 5, Appli



ALIGNMENTS



RESULT 1 

US-09-401-064-233 

; Sequence 233, Application US/09401064
; Patent No. 6623923
; GENERAL INFORMATION:
; APPLICANT: Xu, Jiangchun
; APPLICANT: Lodes, Michael J.
; APPLICANT: Secrist, Heather
; APPLICANT: Benson, Darin R.
; APPLICANT: Meagher, Madeline Joy
; APPLICANT: Stolk, John A.
; APPLICANT: Wang, Tongtong
; TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND
; TITLE OF INVENTION: DIAGNOSIS OF COLON CANCER AND METHODS FOR THEIR USE
; FILE REFERENCE: 210121.471C2
; CURRENT APPLICATION NUMBER: US/09/401,064
; CURRENT FILING DATE: 1999-09-22
; NUMBER OF SEQ ID NOS: 371
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 233
; LENGTH: 508
; TYPE: DNA
; ORGANISM: Homo sapien
US-09-401-064-233

Query Match 61.6%; Score 499; DB 4; Length 508;
Best Local Similarity 100.0%; Pred. No. 2.1e-153;
Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499





RESULT 2 

US-09-401-064-245 

; Sequence 245, Application US/09401064
; Patent No. 6623923
; GENERAL INFORMATION:
; APPLICANT: Xu, Jiangchun
; APPLICANT: Lodes, Michael J.
; APPLICANT: Secrist, Heather
; APPLICANT: Benson, Darin R.
; APPLICANT: Meagher, Madeline Joy
; APPLICANT: Stolk, John A.
; APPLICANT: Wang, Tongtong
; TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND
; TITLE OF INVENTION: DIAGNOSIS OF COLON CANCER AND METHODS FOR THEIR USE
; FILE REFERENCE: 210121.471C2
; CURRENT APPLICATION NUMBER: US/09/401,064
; CURRENT FILING DATE: 1999-09-22
; NUMBER OF SEQ ID NOS: 371
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 245
; LENGTH: 508
; TYPE: DNA
; ORGANISM: Homo sapien
US-09-401-064-245

Query Match 61.6%; Score 499; DB 4; Length 508;
Best Local Similarity 100.0%; Pred. No. 2.1e-153;
Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy        312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy        372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy        432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy        492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy        552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy        612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy        672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy        732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy        792 AACGCAATTATATCCATAA 810
              |||||||||||||||||||
Db        481 AACGCAATTATATCCATAA 499



RESULT 3 

US-09-276-531-33 

Sequence 33, Application US/09276531 
Patent No. 6183968 
GENERAL INFORMATION: 

APPLICANT: Bandman, Olga 
APPLICANT: Lal, Preeti
APPLICANT: Hillman, Jennifer L.
APPLICANT: Yue, Henry 
APPLICANT: Reddy, Roopa 
APPLICANT: Guegler, Karl J. 
APPLICANT: Baughn, Marian R. 

TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF GENES ENCODING 
TITLE OF INVENTION: RECEPTORS AND PROTEINS ASSOCIATED WITH CELL 
PROLIFERATION 

NUMBER OF SEQUENCES: 134
CORRESPONDENCE ADDRESS: 

ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
STREET: 3174 PORTER DRIVE 
CITY: PALO ALTO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94304
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/276,531
FILING DATE: Herewith 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 60/079,677 
FILING DATE: March 27, 1998 
CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION:
NAME: Lynn E. Murry, Ph.D.
REGISTRATION NUMBER: 42,918
REFERENCE/DOCKET NUMBER: PA-0008 US
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (650) 855-0555 
TELEFAX: (650) 845-4166 
INFORMATION FOR SEQ ID NO: 33: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1455 base pairs 
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: BRAITUT01 
CLONE: 746308 
US-09-276-531-33 



Query Match 6.1%; Score 49.8; DB 3; Length 1455;
Best Local Similarity 51.1%; Pred. No. 7.8e-06;
Matches 117; Conservative 0; Mismatches 112; Indels 0; Gaps 0;

Qy        504 TTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAAAGTGGCAGTCGC 563
              ||||  |||     ||| ||||            ||||||   |       |||  | ||
Db        544 TTTTCCCAAAATGCTATATTGCAATTGGACTGGAGGCTATAAGTGGTCTACGGCTCTGGC 603

Qy        564 ATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGGATACCCTGCTTT 623
                |     |    || ||  |||| |||||||| || || ||||| ||  |
Db        604 TCTAAGCATCACCCTCGGTGGGTTTGGAGCAGACCGTTTCTACCTGGGCCAGTGGCGGGA 663

Qy        624 GGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCTAATTGATTTCAT 683
               ||  |    ||| | | ||   | ||       |||||  |||  || || ||  || |
Db        664 AGGCCTCGGCAAGCTCTTCAGCTTCGGTGGCCTGGGAATATGGACGCTGATAGACGTCCT 723

Qy        684 TCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATT 732
               || |||  | |      |||||||||  |||||||   |   ||||||
Db        724 GCTCATTGGAGTTGGCTATGTTGGACCAGCAGATGGCTCTTTGTACATT 772



RESULT 4 

US-09-489-039A-6022

Sequence 6022, Application US/09489039A 
Patent No. 6610836 
GENERAL INFORMATION: 
APPLICANT: Gary Breton et al.

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO 
KLEBSIELLA 

TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS 
FILE REFERENCE: 2709.2004001 
CURRENT APPLICATION NUMBER: US/09/489,039A
CURRENT FILING DATE: 2000-01-27 
PRIOR APPLICATION NUMBER: US 60/117,747 
PRIOR FILING DATE: 1999-01-29 
NUMBER OF SEQ ID NOS: 14342
SEQ ID NO 6022 
LENGTH: 1119 
TYPE: DNA 

ORGANISM: Klebsiella pneumoniae 
US-09-489-039A-6022 

Query Match 5.0%; Score 40.4; DB 4; Length 1119; 

Best Local Similarity 50.0%; Pred. No. 0.0079; 

Matches 101; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy        105 GAACCTCGCCCTGTTGCCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGG 164
              ||||  || |   || | |  | | |   ||| ||||  |   ||||     |  ||| |
Db        672 GAACAGCGGCGGTTTCCTCGGCGCGCCGACGCCCCTGAACAACGGCGTGCTGGAGAGTAG 731

Qy        165 CGAGAAAGTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGA 224
              |||   || | | | | ||  || |||  | |||       |    ||  | || || |
Db        732 CGAACCAGCGGCAGCCGCCGCGACGGCTCCTGCCGCCGGCGCCACGCCAACAGCGCCAGT 791

Qy        225 GGCCGTGACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTG 284
              | ||| | | | |    |    |||    ||  |  ||   | | |||  | | | ||
Db        792 GACCGCGCCTGGCTCCATTCAGGGTAATGTGGCGCCCGCTGCGGCCACCGCCGCAGCCGC 851

Qy        285 GGGGGCTGTTGCCACCTCCGCC 306
               || || || ||  ||||  ||
Db        852 TGGCGCCGTGGCGGCCTCGTCC 873



RESULT 5 

US-10-027-983-11 

; Sequence 11, Application US/10027983
; Patent No. 6617162
; GENERAL INFORMATION:
; APPLICANT: Kenneth W. Dobie
; APPLICANT: Mark P. Roach
; TITLE OF INVENTION: ANTISENSE MODULATION OF ESTROGEN RECEPTOR ALPHA EXPRESSION
; FILE REFERENCE: RTS-0340
; CURRENT APPLICATION NUMBER: US/10/027,983
; CURRENT FILING DATE: 2001-12-18
; NUMBER OF SEQ ID NOS: 98
; SEQ ID NO 11
; LENGTH: 392000
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: unsure
; LOCATION: 137740
; OTHER INFORMATION: unknown
; NAME/KEY: unsure
; LOCATION: 137742
; OTHER INFORMATION: unknown
; NAME/KEY: misc_feature
; LOCATION: (138122)...(138221)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: unsure
; LOCATION: 145507
; OTHER INFORMATION: unknown
; NAME/KEY: unsure
; LOCATION: 151967
; OTHER INFORMATION: unknown
; NAME/KEY: misc_feature
; LOCATION: (151967)...(1542066)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: unsure
; LOCATION: 154217
; OTHER INFORMATION: unknown
; NAME/KEY: misc_feature
; LOCATION: (164037)...(164136)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (174657)...(174756)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (186224)...(186323)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (195242)...(195341)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: unsure
; LOCATION: 202703
; OTHER INFORMATION: unknown
; NAME/KEY: misc_feature
; LOCATION: (202771)...(202870)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (206246)...(215602)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (218126)...(218225)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (220360)...(220459)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (222717)...(222816)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (223981)...(224080)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (227487)...(227586)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (230157)...(230256)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (232299)...(232398)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (236552)...(2366651)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: misc_feature
; LOCATION: (238789)...(248788)
; OTHER INFORMATION: n = A,T,C or G
; NAME/KEY: exon
; LOCATION: (118288)...(119101)
; OTHER INFORMATION: exon 1C
; NAME/KEY: exon:intron junction
; LOCATION: (151129)...(151130)
; OTHER INFORMATION: exon 5:intron 5
; NAME/KEY: exon:intron junction
; LOCATION: (299248)...(299249)
; OTHER INFORMATION: exon 9:intron 9
; NAME/KEY: exon:intron junction
; LOCATION: (348578)...(348579)
; OTHER INFORMATION: exon 10:intron 10
; NAME/KEY: intron
; LOCATION: (348579)...(381838)
; OTHER INFORMATION: intron 10
; NAME/KEY: intron:exon junction
; LOCATION: (386185)...(386186)
; OTHER INFORMATION: intron 11:exon 12
US-10-027-983-11 

Query Match 4.8%; Score 38.6; DB 4; Length 392000; 

Best Local Similarity 54.6%; Pred. No. 1.5; 

Matches 77; Conservative 0; Mismatches 64; Indels 0; Gaps 0;



Qy 237 CAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGC 296 

I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I 

Db 127492 CAAATTTGGCTGCGTGCAGCTGCTCATGCCTGTCATCCCAGCACTTTGAGGAACTGAAGG 127551

Qy 297 CACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATAT 356 

I II I I I I I I I I I I I II III MINIMI III 

Db 127552 GAGGATTGCTTGAGTCCAGGAGTTCCAGACCAGCCTGGGCAACACAGTGAGACCCTGTCT 127611

Qy 357 TTGTAAAGATCCAAAAATAAA 377

I I I I I I I I I I I I I I 
Db 127612 CTACAAAAAAACAAAAACAAA 127632 



RESULT 6 

US-09-103-840A-2 

Sequence 2, Application US/09103840A 
Patent No. 6294328 
GENERAL INFORMATION: 
APPLICANT: FLEISCHMAN, Robert D. 
APPLICANT: WHITE, Owen R. 
APPLICANT: FRASER, Claire M. 
APPLICANT: VENTER, John C. 

TITLE OF INVENTION: DNA SEQUENCES FOR STRAIN ANALYSIS IN MYCOBACTERIUM 
TITLE OF INVENTION: TUBERCULOSIS 
FILE REFERENCE: 24366-20007.00 
CURRENT APPLICATION NUMBER: US/09/103,840A
CURRENT FILING DATE: 1998-06-24
NUMBER OF SEQ ID NOS: 2
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2 

LENGTH: 4403765 
TYPE: DNA 

ORGANISM: Mycobacterium tuberculosis 
FEATURE: 

OTHER INFORMATION: CDC 1551 

OTHER INFORMATION: "n" bases at various positions throughout the sequence 
OTHER INFORMATION: represent a, t, c or g 
US-09-103-840A-2 

Query Match 4.5%; Score 36.8; DB 3; Length 4403765; 

Best Local Similarity 47.8%; Pred. No. 25; 

Matches 107; Conservative 0; Mismatches 117; Indels 0; Gaps 0; 
Qy 112 GCCCTGTTGCCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAA 171

I-I-I — 1^— l-M "I — I — li-l 1 — II — l-H l-H — I — i-l-tl — — I 



Db 676161 GCCGGGCGCGCCGTTCGCGCCATGCGCGCTGCCGCCGACGCTGGCGCCACCGGCGCCACC 676220

Qy 172 GTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 231 

I I I I II I I I I I I I I II I I I II I I I I I III 

Db 676221 GGCCCCACCGGCGCCCGGGTTGCCGCCATTGCCACCGGTCCCGCCGGCACGAAGGTTGTG 676280

Qy 232 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 291 

I I I I I II I I I I I I I I I I I I II I I I I I 



Db 676281 ACCCCACGTCCCGGTAGCGCCGTTGCCGCCGTCACCGGGAGCTCCGCCGTCACCGCCGCT 676340



Qy 292 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGA 335 

I I I I I I I I I I I I I I I I I I M I I I I I 

Db 676341 ACCGCCAGCCCCGCCGGCGCCGTGGCTGCCGCCGAGGCCGAGCA 676384



RESULT 7 

US-08-552-142A-16 

Sequence 16, Application US/08552142A 
Patent No. 5695995 
GENERAL INFORMATION: 

APPLICANT: Weintraub, Harold M. 
APPLICANT: Lee, Jacqueline E. 
APPLICANT: Tapscott, Stephen J. 
APPLICANT: Hollenberg, Stanley M. 

TITLE OF INVENTION: Neurogenic Differentiation (NeuroD) Genes 
TITLE OF INVENTION: and Proteins 
NUMBER OF SEQUENCES: 20 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Christensen O'Connor Johnson KindnessPLLC 
STREET: 1420 Fifth Avenue, Suite 2800 
CITY: Seattle 
STATE: WA 
COUNTRY: USA 
ZIP: 98101-2347 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/552,142A
FILING DATE: 02-NOV-1995
CLASSIFICATION: 514
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/239,238
FILING DATE: 06-MAY-1994
PRIOR APPLICATION DATA:

APPLICATION NUMBER: WO PCT/US95/05741
FILING DATE: 08-MAY-1995
ATTORNEY/AGENT INFORMATION:
NAME: Broderick, Thomas F.
REGISTRATION NUMBER: 31,332
REFERENCE/DOCKET NUMBER: FHCR-1-8933

TELECOMMUNICATION INFORMATION:

TELEPHONE: 206-682-8100
TELEFAX: 206-225-0709



INFORMATION FOR SEQ ID NO: 16: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1462 base pairs 
TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE: 



ORGANISM: Mus musculus
IMMEDIATE SOURCE: 

CLONE : 1.1.1 
FEATURE: 

NAME/KEY: CDS
LOCATION: 231..1101
US-08-552-142A-16

Query Match 4.5%; Score 36.4; DB 1; Length 1462;

Best Local Similarity 49.0%; Pred. No. 0.19; 

Matches 97; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

| | I I I I I I I I I I I 

Db 261 CTCCTCTCGGACGTGCCCAAGTTCGCCAGCTGGGGCGACGGCGACGACGACGAGCCGAGG 320

Qy 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

I I I I I I I I M I I I I I I I I I I I I M 

Db 321 AGCGACAAGGGCGACGCGCCGCCGCAGCCTTCTCCTGCTCCCGGGTCGGGGGCTCCAGGA 380 

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 

Ml III I I I II I I I I I I I Ml I I I I 

Db 381 CCCGCCCGGGCCGCCAAGCCAGTGTCTCTTCGTGGAGGAGAAGAGATCCCTGAACCCACG 440

Qy 301 TCCGCCGGGGGCGAGGAG 318 

I I I I I I I I I I I I 
Db 441 TTGGCTGAGGTCAAGGAG 458 



RESULT 8 

US-08-910-973-16 

; Sequence 16, Application US/08910973 
; Patent No. 5795723 
; GENERAL INFORMATION: 

APPLICANT: Tapscott, Stephen J. 
; APPLICANT: Olson, James M. 

; TITLE OF INVENTION: Expression of Neurogenic bHLH Genes in Primitive 
Neuroectoder 

; NUMBER OF SEQUENCES: 24 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Christensen O'Connor Johnson KindnessPLLC 

; STREET: 1420 Fifth Avenue, Suite 2800 

CITY: Seattle 
; STATE : WA 

COUNTRY: USA 
; ZIP: 98101-2347 
; COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/910,973 
; FILING DATE: 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/239,238 



FILING DATE: 06-MAY-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/US95/05741 
; FILING DATE: 08-MAY-1995 

PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: PCT/US96/17532 

FILING DATE: 30-October-1996 
; ATTORNEY/AGENT INFORMATION:
; NAME: Sheiness, Diana K.

REGISTRATION NUMBER: 35,356
REFERENCE/DOCKET NUMBER: FHCR-1-10958
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 206-682-8100; 206-224-0735 (direct) 
; TELEFAX: 206-225-0779 

; INFORMATION FOR SEQ ID NO: 16: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 1951 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
; ORIGINAL SOURCE: 
; ORGANISM: Mus musculus

IMMEDIATE SOURCE: 
; CLONE: 1.1.1 (mouse neuroD2) 

FEATURE: 

NAME/KEY: CDS
LOCATION: 230..1378
US-08-910-973-16 

Query Match 4.5%; Score 36.4; DB 1; Length 1951; 

Best Local Similarity 49.0%; Pred. No. 0.23; 

Matches 97; Conservative 0; Mismatches 101; Indels 0; Gaps 0;

Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180 

I I I I I I 1 III ! II III III I I I I I I I I I I 
Db 260 CTCCTCTCGGACGTGCCCAAGTTCGCCAGCTGGGGCGACGGCGACGACGACGAGCCGAGG 319 

Qy 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

I I I I I I I I I I I I I II I I I I I M I I I I I I I II 

Db 320 AGCGACAAGGGCGACGCGCCGCCGCAGCCTTCTCCTGCTCCCGGGTCGGGGGCTCCAGGA 379

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 

III I I I I I I II I I I I I I I III I I I I 

Db 380 CCCGCCCGGGCCGCCAAGCCAGTGTCTCTTCGTGGAGGAGAAGAGATCCCTGAACCCACG 439

Qy 301 TCCGCCGGGGGCGAGGAG 318

I I I I II I I I I I I 
Db 440 TTGGCTGAGGTCAAGGAG 457 



RESULT 9 

US-09-499-227-16 

; Sequence 16, Application US/09499227 
; Patent No. 6444463 
; GENERAL INFORMATION: 

APPLICANT: Tapscott, Stephen J. 



; APPLICANT: Olson, James M. 

; TITLE OF INVENTION: Expression of Neurogenic bHLH Genes in Primitive 
Neuroectoder 

NUMBER OF SEQUENCES : 24 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Christensen O'Connor Johnson KindnessPLLC 

; STREET: 1420 Fifth Avenue, Suite 2800 

CITY: Seattle 

STATE: WA 

COUNTRY: USA 

ZIP: 98101-2347 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.25

; CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/499,227
; FILING DATE: 05-August-1998 

; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/239,238 
; FILING DATE: 06-May-1994 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/US95/05741 
; FILING DATE: 08-May-1995 

PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: PCT/US96/17532 

FILING DATE: 30-October-1996 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/910,973 

FILING DATE: 07-August-1997 
; ATTORNEY/AGENT INFORMATION: 

NAME: Sheiness, Diana K. 

REGISTRATION NUMBER: 35,356 

REFERENCE/DOCKET NUMBER: FHCR-1-12742
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 206-682-8100; 206-224-0735 (direct) 

TELEFAX: 206-225-0779 
; INFORMATION FOR SEQ ID NO: 16: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1951 base pairs 

; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 

ORIGINAL SOURCE: 

; ORGANISM: Mus musculus

; IMMEDIATE SOURCE: 

; CLONE: 1.1.1 (mouse neuroD2) 

; FEATURE: 

NAME/KEY: CDS
; LOCATION: 230..1378

US-09-499-227-16 



Query Match 4.5%; Score 36.4; DB 4; Length 1951; 

Best Local Similarity 49.0%; Pred. No. 0.23; 

Matches 97; Conservative 0; Mismatches 101; Indels 0; Gaps 0;



Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180 

I I I I I I I III I II III IN I I I I I I I I I I 

Db 260 CTCCTCTCGGACGTGCCCAAGTTCGCCAGCTGGGGCGACGGCGACGACGACGAGCCGAGG 319

Qy 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240 

Mill MM I I I I M I I II I I I I I I I I I I II 

Db 320 AGCGACAAGGGCGACGCGCCGCCGCAGCCTTCTCCTGCTCCCGGGTCGGGGGCTCCAGGA 379 

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 

Ml Ml I I I I I I I I I I I I Ml I I I I 

Db 380 CCCGCCCGGGCCGCCAAGCCAGTGTCTCTTCGTGGAGGAGAAGAGATCCCTGAACCCACG 439

Qy 301 TCCGCCGGGGGCGAGGAG 318 

I I I I I I I I I II I 
Db 440 TTGGCTGAGGTCAAGGAG 457



RESULT 10 
US-10-204-708-32 

; Sequence 32, Application US/10204708 

; Patent No. 6677731 

; GENERAL INFORMATION: 

; APPLICANT: OLEK, Alexander 

; APPLICANT: PIEPENBROCK, Christian 

; APPLICANT: BERLIN, Kurt 

TITLE OF INVENTION: Diagnosis of Diseases Associated with DNA Replication 
; TITLE OF INVENTION: by Assessing DNA Methylation 
; FILE REFERENCE: 5013.1012 

; CURRENT APPLICATION NUMBER: US/10/204,708

; CURRENT FILING DATE: 2003-05-06

; PRIOR APPLICATION NUMBER: PCT/EP01/03971

; PRIOR FILING DATE: 2001-04-06 

; PRIOR APPLICATION NUMBER: DE 10019058.8 

; PRIOR FILING DATE: 2000-04-06 

; PRIOR APPLICATION NUMBER: DE 10019173.8 

; PRIOR FILING DATE: 2000-04-07 

; PRIOR APPLICATION NUMBER: DE 10032529.7 

; PRIOR FILING DATE: 2000-06-30 

; PRIOR APPLICATION NUMBER: DE 10043826.1 

; PRIOR FILING DATE: 2000-09-01 

; NUMBER OF SEQ ID NOS : 98 

; SEQ ID NO 32 

; LENGTH: 8093

; TYPE: DNA

ORGANISM: Artificial Sequence
; FEATURE:

OTHER INFORMATION: chemically treated genomic DNA (Homo sapiens) 

US-10-204-708-32 

Query Match 4.5%; Score 36.2; DB 4; Length 8093; 

Best Local Similarity 54.0%; Pred. No. 0.69; 

Matches 74; Conservative 0; Mismatches 63; Indels 0; Gaps 0; 

Qy 673 ATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATT 732

| | | M I I II I I I I I I I I I I II I I I I I I Ml III 

Db 6468 AGTAATTTTGTATTTTTATTAAGATAATTTGTTTTGTGTTAAAATAGTAATTTTTAAATT 6527



Qy 733 ATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAA 792

| | | | | I | I I II III II I I I I I I I I I I M II I 

Db 6528 TTTGTTTATTATGAAAAGGTAATTTTAAAGTTTATTATGTAAAATTAATTATAAATAGGA 6587

Qy 793 ACGCAATTATATCCATA 809

I I I I II I Ml 
Db 6588 TTTAATTTATATTTATA 6604



RESULT 11 

US-09-252-991A-12127 

Sequence 12127, Application US/09252991A 
Patent No. 6551795 
GENERAL INFORMATION: 
APPLICANT: Marc J. Rubenfield et al.

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 107196.136

CURRENT APPLICATION NUMBER: US/09/252,991A
CURRENT FILING DATE: 1999-02-18 
PRIOR APPLICATION NUMBER: US 60/074,788 
PRIOR FILING DATE: 1998-02-18 
PRIOR APPLICATION NUMBER: US 60/094,190 
PRIOR FILING DATE: 1998-07-27 
NUMBER OF SEQ ID NOS : 33142 
SEQ ID NO 12127 
LENGTH: 450 
TYPE: DNA 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-12127 

Query Match 4.4%; Score 35.4; DB 4; Length 450; 

Best Local Similarity 48.3%; Pred. No. 0.19; 

Matches 99; Conservative 0; Mismatches 106; Indels 0; Gaps 0; 

Qy 60 GCGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTT 119

I M I I I I I I Ml I I I I I I M I I I I I 

Db 45 GCGCAGCCAGGTCACCACCCTGACCCGCCGCCTGCGTCGCGAGGCGCAGGCCGATCCGGT 104 

Qy 120 GCCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGT 179

|| I I I I I I I I I I I I III II I I I I I I N I 

Db 105 GCAATTCTCGCAACTGGTCGTGCTTGGCGCGATCGACCGCCTTGGCGGCGACGTCACACC 164 

Qy 180 CTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAG 239 

I — I— I — I— I— l-l — I— 1— l-l-l— I — I I — I — 1 — I I I I H-l-l— l-l — H- ' ' 



Db 165 TTCCGAGCTGGCCGCCGCCGAGCGGATGCGCTCGTCGAATCTGGCCGCGCTGCTGCGCGA 224

Qy 240 ACTCGTTGGTGTCCTGTGGTTCGTC 264

III I I I II HIM 

Db 225 ACTGGAACGCGGAGGGCTGATCGTC 249



RESULT 12 

US-09-252-991A-12291/c 

; Sequence 12291, Application US/09252991A 



Patent No. 6551795 
GENERAL INFORMATION:
APPLICANT: Marc J. Rubenfield et al.

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 107196.136

CURRENT APPLICATION NUMBER: US/09/252,991A
CURRENT FILING DATE: 1999-02-18 
PRIOR APPLICATION NUMBER: US 60/074,788 
PRIOR FILING DATE: 1998-02-18 
PRIOR APPLICATION NUMBER: US 60/094,190 
PRIOR FILING DATE: 1998-07-27 
NUMBER OF SEQ ID NOS : 33142 
SEQ ID NO 12291 
LENGTH: 1404 
TYPE: DNA 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-12291 

Query Match 4.4%; Score 35.4; DB 4; Length 1404; 

Best Local Similarity 48.3%; Pred. No. 0.4; 

Matches 99; Conservative 0; Mismatches 106; Indels 0; Gaps 0; 

Qy 60 GCGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTT 119 

I I I I I I I I I III I I I I I I M I I I I I 

Db 1324 GCGCAGCCAGGTCACCACCCTGACCCGCCGCCTGCGTCGCGAGGCGCAGGCCGATCCGGT 1265

Qy 120 GCCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGT 179 

II I I I I I I I I I I II III M Mill III I 

Db 1264 GCAATTCTCGCAACTGGTCGTGCTTGGCGCGATCGACCGCCTTGGCGGCGACGTCACACC 1205 

Qy 180 CTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAG 239 

Ml II I I I I I I N N I I 

Db 1204 TTCCGAGCTGGCCGCCGCCGAGCGGATGCGCTCGTCGAATCTGGCCGCGCTGCTGCGCGA 1145

Qy 240 ACTCGTTGGTGTCCTGTGGTTCGTC 264 

I II I II I I I II I I 

Db 1144 ACTGGAACGCGGAGGGCTGATCGTC 1120



RESULT 13 
US-10-204-708-36 

; Sequence 36, Application US/10204708 
; Patent No. 6677731 
; GENERAL INFORMATION:

; APPLICANT: OLEK, Alexander

APPLICANT: PIEPENBROCK, Christian 
; APPLICANT: BERLIN, Kurt 

TITLE OF INVENTION: Diagnosis of Diseases Associated with DNA Replication 

TITLE OF INVENTION: by Assessing DNA Methylation 
; FILE REFERENCE: 5013.1012 

; CURRENT APPLICATION NUMBER: US/10/204,708 

; CURRENT FILING DATE: 2003-05-06 

; PRIOR APPLICATION NUMBER: PCT/EP01/03971

; PRIOR FILING DATE: 2001-04-06 

; PRIOR APPLICATION NUMBER: DE 10019058.8 



PRIOR FILING DATE: 2000-04-06 
PRIOR APPLICATION NUMBER: DE 10019173.8 
PRIOR FILING DATE: 2000-04-07 
PRIOR APPLICATION NUMBER: DE 10032529.7 
PRIOR FILING DATE: 2000-06-30 
PRIOR APPLICATION NUMBER: DE 10043826.1 
PRIOR FILING DATE: 2000-09-01 
NUMBER OF SEQ ID NOS : 98 
SEQ ID NO 36 
LENGTH: 9347 
TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE : 

OTHER INFORMATION: chemically treated genomic DNA (Homo sapiens) 
US-10-204-708-36 

Query Match 4.4%; Score 35.4; DB 4; Length 9347; 

Best Local Similarity 45.7%; Pred. No. 1.4; 

Matches 123; Conservative 0; Mismatches 146; Indels 0; Gaps 0; 

Qy 535 AATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCA 594

| | | | || I I I I I I I I I I I I II I I I M M I

Db 7248 AATAGTAGTTGAGATAAAATAGAAGTTTATTTTTTTCGTTAGATAAAATAATTTTAGAGA 7307

Qy 595 GATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTT 654

I Mill I II I I I I I II I I I II I I I I I M

Db 7308 TAGGTAGTTTAGGGTTAAAATGGTGGTTTTACGTTTATTAGGTTTTTATCGTTTTATTTT 7367

Qy 655 TGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCA 714

M || I II I II I I I I I I I I I I I I M II

Db 7368 ATTGTTTTCGAATTAATTTTTATTTTTATAGTTATTTTATTGTTTAAGATGGTTTTTGGA 7427

Qy 715 GATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACT 774

I I I I I I I I I I I I I I I I I IN Ml I

Db 7428 ATTTTAGTTATTAGATTTATATTTTATTTTAGGTAGCGGAAAGTAGGAAGAAGAAAGGGT 7487

Qy 775 AATGAAACATTTAGAAAAACGCAATTATA 803

II III I I I I I I I I I I I I

Db 7488



RESULT 14 
US-08-545-528D-1 

; Sequence 1, Application US/08545528D 
; Patent No. 6537773 
; GENERAL INFORMATION: 

; APPLICANT: Fraser et al.

; TITLE OF INVENTION: Nucleotide Sequence of the Mycoplasma Genitalium Genome, Fragments

; Patent No. 6537773

TITLE OF INVENTION: Thereof, and Uses Thereof
; FILE REFERENCE: PB193P1 

; CURRENT APPLICATION NUMBER: US/08/545,528D

; CURRENT FILING DATE: 1995-10-19 

; PRIOR APPLICATION NUMBER: US 08/488,018 

PRIOR FILING DATE: 1995-06-07 
; PRIOR APPLICATION NUMBER: US 08/473,545 



; PRIOR FILING DATE: 1995-06-07 
; NUMBER OF SEQ ID NOS: 1
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 1 

LENGTH: 580073 

TYPE: DNA 
; ORGANISM: Mycoplasma genitalium 
US-08-545-528D-1 

Query Match 4.4%; Score 35.4; DB 4; Length 580073;

Best Local Similarity 55.2%; Pred. No. 21;

Matches 69; Conservative 0; Mismatches 56; Indels 0; Gaps 0;

Qy 685 CTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACATTATAGATTACTAT 744

I I I I I I I I I I I I I I III II I II II I I I I 

Db 387060 CTTTTTTAAAGTTAAATTGTCCTGCTTTGTGAATCCATGGTAAGCAGGATGGAATTATTA 387119

Qy 745 GGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATAT 804 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 387120 AGTACCTTCCATCCAAGGCTTATTATGAATCATTAAACAATAAGCAAATTCAATTTAAAG 387179

Qy 805 CCATA 809

|||||

Db 387180 CCATA 387184



RESULT 15 
US-09-103-840A-1 

; Sequence 1, Application US/09103840A 

; Patent No. 6294328 

; GENERAL INFORMATION: 

; APPLICANT: FLEISCHMAN, Robert D. 

; APPLICANT: WHITE, Owen R. 

; APPLICANT: FRASER, Claire M. 

; APPLICANT: VENTER, John C. 

; TITLE OF INVENTION: DNA SEQUENCES FOR STRAIN ANALYSIS IN MYCOBACTERIUM 

; TITLE OF INVENTION: TUBERCULOSIS 

; FILE REFERENCE: 24366-20007.00 

; CURRENT APPLICATION NUMBER: US/09/103,840A

; CURRENT FILING DATE: 1998-06-24

; NUMBER OF SEQ ID NOS: 2

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 1

LENGTH: 4411529

; TYPE: DNA

; ORGANISM: Mycobacterium tuberculosis 

OTHER INFORMATION: H37Rv 
US-09-103-840A-1 

Query Match 4.3%; Score 35.2; DB 3; Length 4411529; 

Best Local Similarity 47.3%; Pred. No. 66; 

Matches 106; Conservative 0; Mismatches 118; Indels 0; Gaps 0;

Qy 112 GCCCTGTTGCCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAA 171 

I I I I Ml I I III III Ml Ml I MM I 



Db 674718 GCCGGGCGCGCCGTTCGCGCCATGCGCGCTGCCGCCGACGCTGGCGCCACCGGCGCCACC 674777

Qy 172 GTGTCGGTCTCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTG 231 

I I I I II II II I I I I I I I I I II I II I III 

Db 674778 GGCCCCACCGGCGCCCGGGTTGCCGCCATTGCCACCGGTCCCGCCGGCACCAAGGTTGTG 674837

Qy 232 ACGGCCAGACTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCT 291 

I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 674838 ACCCCACGTCCCGGTAGCGCCGTTGCCGCCGTCACCGGGAGCTCCGCCGTCACCGCCGCT 674897

Qy 292 GTTGCCACCTCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGA 335 

I I I I I I I I I I I I I I I I I I II I I I I I 

Db 674898 ACCGCCAGCCCCGCCGGCGCCGTGGCTGCCGCCGAGGCCGAGCA 674941 



Search completed: March 4, 2004, 09:18:33 
Job time : 106 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 
Run on: March 4, 2004, 08:34:04 



Search time 349 Seconds 

(without alignments) 

8488.596 Million cell updates/sec 



Title: US-09-852-100B-1

Perfect score: 810

Sequence: 1 atgcatattttaaaagggtc aaacgcaattatatccataa 810



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 2421054 seqs, 1828716029 residues 

Total number of hits satisfying chosen parameters: 4842108

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
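The throughput figure in the header can be cross-checked from the other reported values. The sketch below is not part of GenCore; it assumes that one cell update is one query residue compared with one database residue, and that both strands are searched (an assumption suggested by the hit count, 4842108, being exactly twice the 2421054 sequences searched). The function name and its parameters are hypothetical.

```python
# Hypothetical cross-check; the formula is an assumption, not GenCore's
# documented definition of a "cell update".

def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Return query_len * db_residues * strands / seconds, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Values from this run: 810-residue query, 1828716029 residues, 349 seconds
rate = million_cell_updates_per_sec(810, 1_828_716_029, 349)
print(f"{rate:.3f}")
```

Under these assumptions the result rounds to the 8488.596 Million cell updates/sec reported for this run.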



Database 



Published_Applications_NA:*
1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
14: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
15: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
16: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
17: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
18: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.

RESULT 1

US-09-852-100A-1 

; Sequence 1, Application US/09852100A 
; Patent No. US20020058267A1 
; GENERAL INFORMATION: 

APPLICANT: American Home Products 



; TITLE OF INVENTION: Beta-amyloid Peptide-Binding Proteins and Polynucleotides Encoding the

; TITLE OF INVENTION: Same
; FILE REFERENCE: AHP981261p2

; CURRENT APPLICATION NUMBER: US/09/852,100A

; CURRENT FILING DATE: 2001-05-09 

; PRIOR APPLICATION NUMBER: US 09/172,990 

; PRIOR FILING DATE: 1998-10-14 

; PRIOR APPLICATION NUMBER: US 60/104,104 

; PRIOR FILING DATE: 1998-10-13 

; PRIOR APPLICATION NUMBER: PTC/US99/21621 

; PRIOR FILING DATE: 1999-10-13 

; PRIOR APPLICATION NUMBER: US 09/060,609 

; PRIOR FILING DATE: 1998-04-15 

; PRIOR APPLICATION NUMBER: US 60/064,583 

; PRIOR FILING DATE: 1997-04-16 

; NUMBER OF SEQ ID NOS : 2 

SOFTWARE: PatentIn version 3.0
; SEQ ID NO 1

LENGTH: 810

TYPE: DNA
; ORGANISM: Homo sapiens

FEATURE:

NAME/KEY: CDS

LOCATION: (1)..(807)
US-09-852-100A-1 

Query Match 100.0%; Score 810; DB 9; Length 810; 

Best Local Similarity 100.0%; Pred. No. 1.8e-244; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60

Qy 61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120

Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

Qy 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300

Qy 301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360

Qy 361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420

Qy 421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy 481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy 541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGA 600

Qy 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy 661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720

Qy 721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 AGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAA 780

Qy 781 ACATTTAGAAAAACGCAATTATATCCATAA 810
||||||||||||||||||||||||||||||
Db 781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 2 

US-09-833-503A-1 

; Sequence 1, Application US/09833503A 

; Patent No. US20020146760A1

; GENERAL INFORMATION:

; APPLICANT: Ozenberger, Bradley A

; APPLICANT: Kajkowski, Eileen M

; APPLICANT: Lo, Ching-Hsiung F

; APPLICANT: American Home Products Corporation

; TITLE OF INVENTION: No. US20020146760A1el G-Protein-Coupled Receptor-Like Proteins and

; TITLE OF INVENTION: Polynucleotides Encoded By Them, and Methods of Using

; TITLE OF INVENTION: Same

; FILE REFERENCE: AHP98165-00PCT

; CURRENT APPLICATION NUMBER: US/09/833,503A

; CURRENT FILING DATE: 2000-10-13

; PRIOR APPLICATION NUMBER: 60/104,104

; PRIOR FILING DATE: 1998-10-13

; NUMBER OF SEQ ID NOS: 6

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 1 

; LENGTH: 810 

; TYPE: DNA 

; ORGANISM: Homo sapiens 

US-09-833-503A-1 



Query Match 100.0%; Score 810; DB 9; Length 810; 

Best Local Similarity 100.0%; Pred. No. 1.8e-244; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 ATGCATATTTTATWVGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60 

I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 AT GCAT AT T T T AAAAG G GT CT C C C AAT GT GAT T C CAC GGGCT C ACGGGCAGAAGAACAC G 60 

Qy 61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120 

I I II I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120 

Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180 

Qy 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240 

I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I II II I I I I II I I 
Db 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240 

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 

I II I I I I II I II I I I I I II II I I I I I I I I M I I I M I I I I I M I II I I I I ( I I M II I II 
Db 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 

Qy 301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360 

I I I I I I I II I I I I I I I I I II I I I I I II I I I I I I I I I II I I I I I I I I I I I II I I I II I I I I 
Db 301 T CCGCCGGGGGCGAGGAGTCGCTTAAGT GCGAGGACCT CAAAGTGGGACAATATATTT GT 360 

Qy 361 AAAGATCCAAAAATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCT 420 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 361 AAAGAT C CAAAAATAAAT GAC GC TACGCAAGAACC AGT TAAC T GTACAAACTACACAGCT 420 

Qy 421 CAT GT T T C CT GTTT T CCAGCAC C CAACAT AACTT GTAAGGAT T C CAGT GGCAAT GAAACA 480 

I I I I M II I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 
Db 421 CAT GTT T CCTGTT T T CCAGCACCCAAC AT AACTT GTAAGGATT CCAGT GGCAAT GAAACA 480 

Qy 481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 481 CATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGC 540

Qy 541 TATT CCTACAAAGTGGCAGT CGCATTGT CT CTTTTT CTTGGAT GGTTGGGAGCAGATCGA 600 

I I I I I I I I I I I I I II II I I I II I I M I I I I I I M I I I I I I II I I I I I II M ( I II I I I I I 
Db 541 TATT CCTACAAAGT GGCAGT CGCATTGTCT CTTTTT CTTGGATGGTTGGGAGCAGAT CGA 600 

Qy 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

I I I I I I I I II I I I I II I I I II II I I I I I I I I I I II I II I II I I I I I I I I M I I II I I I I I
Db 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660

Qy 661 ATT GGGAGCCT AATT GATT T CATT CTT AT T TCAAT GCAGAT T GT T GGACCTT CAGAT GGA 720 

I I I I I II I I I I I I I II I I I I I II I I I I II I II II II II I I II I I I I I I II I II I I I I I I I 
Db 661 ATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGA 720 

Qy 721 AGT AGT T ACAT T AT AGATT ACT AT GGAAC CAGACT T ACAAGACT GAGT AT T ACTAAT GAA 780 

I I I M M I I I I M M I M M II I I II I M II M I M I I I I I II M I I M I II I I I I M M 

Db 721 AGTAGTTACATTAT AGATTACTATGGAACCAGACTTACAAGACT GAGTATTACTAAT GAA 780 

Qy 781 ACATTTAGAAAAACGCAATTATATCCATAA 810

I I I I I I I I I I I I I I I I I I I I I I I II I II I I
Db 781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 3 
US-10-199-881-1 

; Sequence 1, Application US/10199881 
; Publication No. US20030096356A1 
; GENERAL INFORMATION:
; APPLICANT: Wyeth 

; TITLE OF INVENTION: Novel G-Protein-Coupled Receptor-Like Proteins and Polynucleotides

; TITLE OF INVENTION: Encoded by Them, and Methods of Using Same
; FILE REFERENCE: AHP98165C1 

; CURRENT APPLICATION NUMBER: US/10/199,881 

; CURRENT FILING DATE: 2002-07-18 

; PRIOR APPLICATION NUMBER: PCT/US99/21621

; PRIOR FILING DATE: 1999-10-13 

; PRIOR APPLICATION NUMBER: US 90/833,5081 

; PRIOR FILING DATE: 2001-12-04 

; PRIOR APPLICATION NUMBER: US 60/104,104 

; PRIOR FILING DATE: 1998-10-13 

; NUMBER OF SEQ ID NOS: 45 

; SOFTWARE: PatentIn version 3.1

; SEQ ID NO 1 

LENGTH: 810 

TYPE: DNA 
; ORGANISM: Homo sapiens 
; FEATURE:

NAME/KEY: CDS

LOCATION: (1)..(810)

OTHER INFORMATION: 
US-10-199-881-1 

Query Match 100.0%; Score 810; DB 14; Length 810; 

Best Local Similarity 100.0%; Pred. No. 1.8e-244; 

Matches 810; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 ATGCATATTTTAAAAGGGTCTCCCAATGTGATTCCACGGGCTCACGGGCAGAAGAACACG 60 

I I I II I I I I I M I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II 
Db 1 AT GCAT AT T TTAAAAGGGT CT C C CAATGT GATT CCACGGGCT CAC GGGCAGAAGAACAC G 60 

Qy 61 CGAAGAGACGGAACT GGCCTCTATCCTAT GCGAGGT CCCTTTAAGAACCT CGCCCTGTT G 120 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I M I I II I I I II I I I I I II I I I I I I I I I I I I I 

Db 61 CGAAGAGACGGAACTGGCCTCTATCCTATGCGAGGTCCCTTTAAGAACCTCGCCCTGTTG 120 

Qy 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180

II M I I I I M I M I I I II I II I I I I I I I II I I I I I I I II I I I I I II I I I I I II I I I I I I I 

Db 121 CCCTTCTCCCTCCCGCTCCTGGGCGGAGGCGGAAGCGGAAGTGGCGAGAAAGTGTCGGTC 180 

Qy 181 T C CAAGAT GGC GGC C GCCT GGC C GT CT GGT C CGT C T GCTC C GGAGGC C GT GAC GGCCAGA 240 

I I I I I M I I I I I II I I I I II I I I I I I I I I I I I II I II I M II I II I I II I II I I I I I I I I 
Db 181 TCCAAGATGGCGGCCGCCTGGCCGTCTGGTCCGTCTGCTCCGGAGGCCGTGACGGCCAGA 240 

Qy 241 CTCGTTGGTGTCCTGTGGTTCGTCT CAGT C ACT AC AGGAC CCTGGGGGGCT GT T GC C AC C 300 

I I I I I I I I II I I I I I II I I I I II I I I II I I I I I M I I I I I M I I I I I I M I I I I I I M II 
Db 241 CTCGTTGGTGTCCTGTGGTTCGTCTCAGTCACTACAGGACCCTGGGGGGCTGTTGCCACC 300 



Qy 301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360 

I I I I I I I I I I I 1 I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II M 
Db 301 TCCGCCGGGGGCGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGT 360 

Qy 361 AAAGAT CCAAAAAT AAAT GAC GCT AC GCAAGAACCAGT T AACT GTACAAACT ACACAGCT 420 

I I I I I I I I I II I II I I I I I I I II I I I I I I I I I I I I I I I I I II I I I II I I I I I II I II M I 
Db 361 AAAGAT CCAAAAAT AAAT GAC GCT AC GCAAGAACCAGT T AACT GTACAAACT ACACAGCT 420 

Qy 421 CAT GTTTCCTGTTTTC CAG CAC C C AAC AT AAC T T GT AAG GAT T C CA GT G G C AAT GAAAC A 480 

I I I I I I I I I I II I I I I I I II I I M I I I I I I I I I I I I I I I I II I I II I II I I I I I I I II II 

Db 421 CATGTTTCCTGTTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACA 480

Qy 481 CAT T TT ACTGGGAACGAAGT TGGTTTTTT CAAGCCCAT AT CTT GCC GAAAT GTAAAT GGC 540 

I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 CAT T TT ACT GGGAACGAAGT TGGTTTTTT CAAGCCCAT AT CTT GCC GAAAT GTAAAT GGC 540 

Qy 541 TAT T CCTACAAAGT GGCAGT C GCAT T GT CTCTTTTT CTT GGAT GGT T GGGAGCAGAT CGA 600 

I I I I I II I II M II I I I I I M I M II I M I II I M M M M I II II I I M M I I I II M I 

Db 541 T ATT CCTACAAAGT GGCAGT C GCAT T GT CTCTTTTT CTT GGAT GGT T GGGAGCAGAT CGA 600 

Qy 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660 

I II I I I I II I I I I II I I I I II I I I I I I I I I I I I I I I II I II I I I I I I I I I II I I I I I I I I 
Db 601 TTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGA 660 

Qy 661 ATT GGGAGCCTAATTGATTT CATTCTTATTTCAATGCAGATT GTT GGAC CTT C AGATGGA 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I II I I I I I II I I 
Db 661 AT T GGGAGCCTAATT GAT TT CAT T CT T AT T TCAAT GCAGATT GT T GGAC CTT CAGAT GGA 720 

Qy 721 AGT AGT T ACATTATAGATT ACT AT GGAAC CAGACTT ACAAGACT GAGT AT TACT AAT GAA 780 

I I I II I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 721 AGTAGTTACATTATAGATTACTAT GGAAC CAGACTT ACAAGACT GAGTATTACTAATGAA 780 

Qy 781 AC AT T T AGAAAAAC G C AAT TAT AT C C AT AA 810 

I I I I I I I I I I I I I II I I I I I I I I I I II I I I 
Db 781 ACATTTAGAAAAACGCAATTATATCCATAA 810



RESULT 4 

US-09-922-217-233 

Sequence 233, Application US/09922217 
Patent No. US20020076414A1 
GENERAL INFORMATION: 
APPLICANT: Xu, Jiangchun
APPLICANT: Lodes, Michael J.
APPLICANT: Secrist, Heather
APPLICANT: Benson, Darin R.
APPLICANT: Meagher, Madeleine Joy
APPLICANT: Stolk, John A.
APPLICANT: Wang, Tongtong
APPLICANT: Jiang, Yuqiu
APPLICANT: Smith, Carole Lynn
APPLICANT: King, Gordon E.
APPLICANT: Wang, Aijun
APPLICANT: Clapper, Jonathan D.
TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND DIAGNOSIS
TITLE OF INVENTION: OF COLON CANCER AND METHODS FOR THEIR USE



FILE REFERENCE: 210121.471C13
CURRENT APPLICATION NUMBER: US/09/922,217
CURRENT FILING DATE: 2001-08-03
NUMBER OF SEQ ID NOS: 1124
SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 233 
LENGTH: 508 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-922-217-233 

Query Match 61.6%; Score 499; DB 9; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 312 CGAGGAGT CGCTTAAGT GCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 CGAGGAGT CGCTTAAGT GCGAGGACCTCAAAGT GGGACAATATATTTGTAAAGAT CCAAA 60 

Qy 372 AAT AAAT GACGCT AC GCAAGAACCAGTT AACT GT ACAAACT ACACAGCT CAT GTT T C CT G 431 

I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I II I I I I I I 
Db 61 AAT AAAT GACGCT AC GCAAGAACCAGTT AACT GT ACAAACT ACACAGCT CAT GTT T CCT G 120 

Qy 432 T T T T C C AGCAC C C AAC AT AACT T GT AAG GAT T C C AGT GG C AAT GAAAC AC AT T T TACT G G 491 

I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I II I I I II I I I I I I I I 
Db 121 TTTTC CAGCAC C CAACATAACTTGTAAGGAT TC C AGTG GCAATGAAACACAT T T TAC T GG 180 

Qy 492 GAACGAAGT TGGTTTTTT CAAGCCCAT AT CTT GC C GAAAT GTAAATGGCTAT T CCT ACAA 551 

I 1 I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I II 
Db 181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240 

Qy 552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611 

I I I I I I I I I I I M I I I M I II I M I I I I M M I II I I I I I II I II I II I I M M I M I M 

Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300 

Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671 

II I II II I I I I I II I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360 

Qy 672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731 

I I I I II I I I I I I I I I II I I II I I I I II I I I II I II I I I I I I II II I II I I I I I I I I M II 

Db 361 AATT GAT T T CAT T CT TAT T T CAATGCAGATT GT T GGACCT TCAGAT GGAAGTAGT T ACAT 420 

Qy 732 TAT AGAT TACT AT GGAAC CAGACTT ACAAGACT GAGTATT ACTAATGAAACAT T T AGAAA 791 

II I I I I II M I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I II I I I I I I I II 

Db 421 TATAGATTACTAT GGAACCAGACTTACAAGACT GAGTATTACTAATGAAACATTTAGAAA 480 



Qy 792 AACGCAATTATATCCATAA 810 

I I I I I I I I I I I I I I I I II I 
Db 481 AACGCAATTATATCCATAA 499



RESULT 5 

US-09-922-217-245 

; Sequence 245, Application US/09922217 
; Patent No. US20020076414A1 
; GENERAL INFORMATION: 



APPLICANT: Xu, Jiangchun 
APPLICANT: Lodes, Michael J. 
APPLICANT: Secrist, Heather 
APPLICANT: Benson, Darin R. 
APPLICANT: Meagher, Madeleine Joy 
APPLICANT: Stolk, John A. 
APPLICANT: Wang, Tongtong 
APPLICANT: Jiang, Yuqiu 
APPLICANT: Smith, Carole Lynn 
APPLICANT: King, Gordon E. 
APPLICANT: Wang, Aijun 
APPLICANT: Clapper, Jonathan D.

TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND DIAGNOSIS 
TITLE OF INVENTION: OF COLON CANCER AND METHODS FOR THEIR USE 
FILE REFERENCE: 210121.471C13
CURRENT APPLICATION NUMBER: US/09/922,217
CURRENT FILING DATE: 2001-08-03
NUMBER OF SEQ ID NOS: 1124

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 245 
LENGTH: 508 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-922-217-245 

Query Match 61.6%; Score 499; DB 9; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy 372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy 432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy 492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy 552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy 672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420



Qy 732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791

I I I I I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I M I I I I I I I
Db 421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy 792 AACGCAATTATATCCATAA 810

I I I I I I I I I I I I I I I II I I
Db 481 AACGCAATTATATCCATAA 499



RESULT 6 

US-09-833-263-233 

Sequence 233, Application US/09833263 
Patent No. US20020110547A1 
GENERAL INFORMATION: 
APPLICANT: Wang, Aijun 
APPLICANT: Clapper, Jonathan D. 
APPLICANT: Stolk, John A. 
APPLICANT: Meagher, Madeleine J. 

TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND 

TITLE OF INVENTION: DIAGNOSIS OF COLON CANCER AND METHODS FOR THEIR USE 
FILE REFERENCE: 210121.471C12
CURRENT APPLICATION NUMBER: US/09/833,263
CURRENT FILING DATE: 2001-04-10
NUMBER OF SEQ ID NOS: 1093

SOFTWARE: FastSEQ for Windows Version 3.0 
SEQ ID NO 233 
LENGTH: 508 
TYPE: DNA 

ORGANISM: Homo sapien 
US-09-833-263-233 

Query Match 61.6%; Score 499; DB 9; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371

| | | | | | | | | I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I I I I I I I II I M I I
Db 1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy 372 AATAAAT GACGCTACGCAAGAACCAGTTAACT GT ACAAACT ACACAGCTCATGTTTCCT G 431 

M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I M I I I I I I 
Db 61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy 432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
Db 121 TTTT CCAGCACCCAACATAACTT GTAAGGATT CCAGT GGCAAT GAAACACATTT TACT GG 180 

Qy 492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551

| | | | I I I I I I I I I I I II I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 GAACGAAGT TGGTTTTTT CAAGCC CAT AT CT T GC CGAAAT GTAAATGGCT ATT C CT ACAA 240 

Qy 552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611 

I I I I M I I I I I I M I I I II I II I I I I II I I II I M I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300 



Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671

| | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I M I II I I I I
Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360



Qy 672 AATT GAT T T C ATT CT TAT TT CAAT GC AGAT T GT T GGAC CT T CAGAT G GAAGT AGTT AC AT 731 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 AATTGATTTCATT CTTATTT CAAT GCAGATT GTT GGACCTT CAGATGGAAGTAGTTACAT 420 

Qy 732 TAT AGAT TACT AT GGAAC C AGACT T ACAAGACT GAGT AT T ACT AAT GAAACATTT AGAAA 791 

I | | | M | I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy 792 AACGCAATTATATCCATAA 810

I I I I I I I I I I I II I I I I I I
Db 481 AACGCAATTATATCCATAA 499



RESULT 7 

US-09-833-263-245 

Sequence 245, Application US/09833263 
Patent No. US20020110547A1 
GENERAL INFORMATION: 
APPLICANT: Wang, Aijun 
APPLICANT: Clapper, Jonathan D. 
APPLICANT: Stolk, John A. 
APPLICANT: Meagher, Madeleine J.

TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND 

TITLE OF INVENTION: DIAGNOSIS OF COLON CANCER AND METHODS FOR THEIR USE 
FILE REFERENCE: 210121.471C12
CURRENT APPLICATION NUMBER: US/09/833,263
CURRENT FILING DATE: 2001-04-10
NUMBER OF SEQ ID NOS: 1093

SOFTWARE: FastSEQ for Windows Version 3.0
SEQ ID NO 245 
LENGTH: 508 
TYPE: DNA 

ORGANISM: Homo sapien 
US-09-833-263-245 

Query Match 61.6%; Score 499; DB 9; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 312 CGAGGAGT CGCTTAAGTGCGAGGACCTCAAAGT GGGACAATATATTT GTAAAGATCCAAA 371 

| M | | | I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy 372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTT CCT G 431 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||



Db 61 AAT AAAT GAC GCTAC GC AAGAAC CAGTT AACT GTACAAACT ACAC AGCT CAT GT T T CCT G 120 

Qy 432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491

| | | | I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I
Db 121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy 492 GAAC GAAGTT GGTT TTT T CAAGCC CAT AT CT T GCC GAAAT GT AAAT GGCT AT T CCT ACAA 551 

| | | | | | | I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I M I I M M I I I I II I I 
Db 181 GAAC GAAGT TGGTTTTTT CAAGC C CAT AT CT T GC C GAAAT GT AAAT GGCTAT T CCTACAA 240 



Qy 552 AGT GGC AGT C GC AT TGT CTCTTTTTCTT GGAT GGT T G GGAGCAGAT CGAT T T TAC CT T GG 611 

I I I M | I I I I I I M I I I I I I I I I I I I I I I I I I II I II I I I I I II I I I I I I I I M M M I I 
Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300 

Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671 

| | | | | | | | | | | | I I I I I I I I I I I I I I I II II I I I I I I I I I I I II I II I I I I I I I I I I I I I 
Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360 

Qy 672 AAT T GAT TT CATT CTT ATT T CAAT GC AGATT GT T GGAC CTT CAGAT GGAAGTAGT TACAT 731 

| | | | | | | | | | I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 361 AAT T GAT T T CAT T CT TAT T T CAAT G CAGAT T GT T G GAC CTT CAGAT G G AAGT AGT TACAT 420 

Qy 732 TATAGATTACTAT GGAACCAGACTTACAAGACT GAGTATTACTAATGAAACATTTAGAAA 791 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 421 TATAGATTACTAT GGAACC AGACT TACAAGACT GAGTAT T ACTAAT GAAACATT TAGAAA 480 

Qy 792 AACGCAATTATATCCATAA 810

I I I I I I I I I I I I I I I I I M 
Db 481 AAC GC AAT TAT AT CCATAA 499 



RESULT 8 

US-10-025-380-233 

; Sequence 233, Application US/10025380 
; Publication No. US20020182191A1 
; GENERAL INFORMATION: 



; APPLICANT: Xu, Jiangchun
; APPLICANT: Lodes, Michael J.
; APPLICANT: Secrist, Heather
; APPLICANT: Benson, Darin R.
; APPLICANT: Meagher, Madeleine Joy
; APPLICANT: Stolk, John A.
; APPLICANT: Wang, Tongtong
; APPLICANT: Jiang, Yuqiu
; APPLICANT: Smith, Carole L.
; APPLICANT: King, Gordon E.
; APPLICANT: Wang, Aijun
; APPLICANT: Clapper, Jonathan D.
; APPLICANT: Skeiky, Yasir A. W.
; APPLICANT: Fanger, Gary R.
; APPLICANT: Vedvick Thomas S.
; APPLICANT: Carter, Darrick



; TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND DIAGNOSIS 
; TITLE OF INVENTION: OF COLON CANCER AND METHODS FOR THEIR USE 
; FILE REFERENCE: 210121.471C14
; CURRENT APPLICATION NUMBER: US/10/025,380

; CURRENT FILING DATE: 2001-12-19

; NUMBER OF SEQ ID NOS: 1129

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 233 

LENGTH: 508 

TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-025-380-233 



Query Match 61.6%; Score 499; DB 13; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 



Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 312 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 371

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 CGAGGAGTCGCTTAAGTGCGAGGACCTCAAAGTGGGACAATATATTTGTAAAGATCCAAA 60

Qy 372 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 431

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AATAAATGACGCTACGCAAGAACCAGTTAACTGTACAAACTACACAGCTCATGTTTCCTG 120

Qy 432 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 491

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TTTTCCAGCACCCAACATAACTTGTAAGGATTCCAGTGGCAATGAAACACATTTTACTGG 180

Qy 492 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 551

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 GAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAATGTAAATGGCTATTCCTACAA 240

Qy 552 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 611

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360

Qy 672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy 732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 480

Qy 792 AACGCAATTATATCCATAA 810

|||||||||||||||||||
Db 481 AACGCAATTATATCCATAA 499





RESULT 9 

US-10-025-380-245 

Sequence 245, Application US/10025380 
Publication No. US20020182191A1 
GENERAL INFORMATION:



APPLICANT: Xu, Jiangchun
APPLICANT: Lodes, Michael J.
APPLICANT: Secrist, Heather
APPLICANT: Benson, Darin R.
APPLICANT: Meagher, Madeleine Joy
APPLICANT: Stolk, John A.
APPLICANT: Wang, Tongtong
APPLICANT: Jiang, Yuqiu
APPLICANT: Smith, Carole L.
APPLICANT: King, Gordon E.
APPLICANT: Wang, Aijun
APPLICANT: Clapper, Jonathan D.
APPLICANT: Skeiky, Yasir A. W.



APPLICANT: Fanger, Gary R. 
APPLICANT: Vedvick Thomas S. 
APPLICANT: Carter, Darrick 

TITLE OF INVENTION: COMPOUNDS FOR IMMUNOTHERAPY AND DIAGNOSIS 
TITLE OF INVENTION: OF COLON CANCER AND METHODS FOR THEIR USE 
FILE REFERENCE: 210121.471C14
CURRENT APPLICATION NUMBER: US/10/025,380
CURRENT FILING DATE: 2001-12-19
NUMBER OF SEQ ID NOS: 1129

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 245 
LENGTH: 508 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-10-025-380-245 

Query Match 61.6%; Score 499; DB 13; Length 508; 

Best Local Similarity 100.0%; Pred. No. 1.8e-146; 

Matches 499; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 312 CGAGGAGTCGCTTAAGT GCGAGGACCT CAAAGTGGGACAATATATTT GTAAAGAT CCAAA 371 

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M > I I I I I I I I I 
Db 1 C GAGGAGTC GCTTAAGT GCGAGGACCT CAAAGT GGGACAAT AT AT TT GTAAAGAT C CAAA 60 

Qy 372 AAT AAAT GACGCT ACGCAAGAAC C AGT T AACT GT ACAAACT ACACAGCT CAT GTT T C CT G 431 

| | M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 AAT AAAT GACGCT ACGCAAGAAC CAGTTAACTGT ACAAACT AC ACAGCT CAT GT T T C CT G 120 

Qy 432 TTTT CCAGCACCCAACATAACTT GTAAGGATTCCAGT GGCAATGAAACACATTTTACT GG 491 

| I I I I M I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 TT T T C C AGCACC CAAC AT AACTT GT AAGGAT T C CAGT GGCAAT GAAACACAT TT TACT GG 180 

Qy 492 GAACGAAGT T GGT TTT T T CAAGC C CAT AT CT T GC CGAAAT GTAAATGGCT ATT CCT ACAA 551 

I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 181 GAACGAAGT T GGTT TTT TCAAGC C CAT AT CTT GC CGAAAT GT AAAT GGCT AT T C CT ACAA 240 

Qy 552 AGT GGCAGT CGCATTGT CT CT TTTT CTT GGATGGTT GGGAGCAGAT CGATTTTACCTT GG 611 

I I I M I I I I I I I I I I I I I I I I I I M I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 AGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGAGCAGATCGATTTTACCTTGG 300

Qy 612 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 671 

I || | | | | | | | | I I I I I I I I I I I I I I I I M II I II I I M I II I I I I II M I I I I I I I I I I I 
Db 301 ATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGGTTTTGTGGAATTGGGAGCCT 360 

Qy 672 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 731

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M I I I I I 

Db 361 AATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCTTCAGATGGAAGTAGTTACAT 420

Qy 732 TATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACATTTAGAAA 791

I M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 421 TATAGATTACTAT GGAACCAGACTTACAAGACTGAGTATTACTAAT GAAACATTTAGAAA 480 

Qy 792 AACG C AAT TAT AT C C AT AA 810 

I I I I I I I I I I I I I I I I I M 
Db 481 AACGCAATTATAT CCATAA 499 



RESULT 10 

US-09-918-995-6918 

; Sequence 6918, Application US/09918995 

; Publication No. US20030073623A1 

; GENERAL INFORMATION: 

; APPLICANT: Hyseq, Inc. 

; TITLE OF INVENTION: NOVEL NUCLEIC ACID SEQUENCES OBTAINED 
; TITLE OF INVENTION: FROM VARIOUS cDNA LIBRARIES 
; FILE REFERENCE: 20411-756 

; CURRENT APPLICATION NUMBER: US/09/918,995

; CURRENT FILING DATE: 2001-07-30 

; PRIOR APPLICATION NUMBER: US/09/235,076

; PRIOR FILING DATE: 1999-01-20 

; NUMBER OF SEQ ID NOS: 38054

SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 6918 

LENGTH: 431 

TYPE: DNA 
; ORGANISM: Homo sapiens 

FEATURE: 

NAME/KEY: misc_feature
LOCATION: (1)...(431)
OTHER INFORMATION: n = A,T,C or G 
US-09-918-995-6918 

Query Match 41.7%; Score 337.4; DB 10; Length 431; 

Best Local Similarity 99.7%; Pred. No. 1.5e-95; 

Matches 338; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 472 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 531

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 AATGAAACACATTTTACTGGGAACGAAGTTGGTTTTTTCAAGCCCATATCTTGCCGAAAT 60

Qy 532 GTAAATGGCTATTCCTACAAAGTGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 591

|||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
Db 61 GTAAATGGCTATTCCTACAAAGAGGCAGTCGCATTGTCTCTTTTTCTTGGATGGTTGGGA 120

Qy 592 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 651

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GCAGATCGATTTTACCTTGGATACCCTGCTTTGGGTTTGTTAAAGTTTTGCACTGTAGGG 180

Qy 652 TTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 711

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 TTTTGTGGAATTGGGAGCCTAATTGATTTCATTCTTATTTCAATGCAGATTGTTGGACCT 240

Qy 712 TCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATT 771

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TCAGATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATT 300

Qy 772 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810

|||||||||||||||||||||||||||||||||||||||
Db 301 ACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 339



RESULT 11 

US-10-085-783A-36056

; Sequence 36056, Application US/10085783A 



Publication No. US20040037841A1 
GENERAL INFORMATION: 
APPLICANT: ChondroGene Inc. 
APPLICANT: Liew, C.C. 

TITLE OF INVENTION: Compositions and Methods Relatiing to Osteoarthritis 
FILE REFERENCE: 4231/2002 

CURRENT APPLICATION NUMBER: US/10/085,783A
CURRENT FILING DATE: 2002-02-28 
PRIOR APPLICATION NUMBER: US 60/305,340 
PRIOR FILING DATE: 2001-07-13 
PRIOR APPLICATION NUMBER: US 60/275,017 
PRIOR FILING DATE: 2001-03-12 
PRIOR APPLICATION NUMBER: US 60/271,955 
PRIOR FILING DATE: 2001-02-28 
NUMBER OF SEQ ID NOS: 58994
SOFTWARE: PatentIn version 3.2
SEQ ID NO 36056 
LENGTH: 256 
TYPE: DNA 
ORGANISM: Human 
FEATURE:

NAME/KEY: misc_feature
LOCATION: (2)..(2)

OTHER INFORMATION: n is a, c, g, or t
FEATURE:

NAME/KEY: misc_feature
LOCATION: (13)..(13)

OTHER INFORMATION: n is a, c, g, or t
FEATURE:

NAME/KEY: misc_feature
LOCATION: (30)..(30)

OTHER INFORMATION: n is a, c, g, or t 
US-10-085-783A-36056 

Query Match 12.3%; Score 100; DB 12; Length 256;
Best Local Similarity 99.1%; Pred. No. 8.7e-21;
Matches 111; Conservative 0; Mismatches 0; Indels 1; Gaps 1;

Qy    700 ATTGTTGGACCTTCAG-ATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 758
          |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Db     14 ATTGTTGGACCTTCAGNATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 73

Qy    759 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
          ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     74 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 125



RESULT 12 

US-10-242-535A-36056
; Sequence 36056, Application US/10242535A
; Publication No. US20040013663A1
; GENERAL INFORMATION:
; APPLICANT: ChondroGene Inc.
; APPLICANT: Liew, C.C.
; TITLE OF INVENTION: Compositions and Methods Relating to Osteoarthritis
; FILE REFERENCE: 4231/2005
; CURRENT APPLICATION NUMBER: US/10/242,535A
; CURRENT FILING DATE: 2002-09-12
; PRIOR APPLICATION NUMBER: US 10/085,783
; PRIOR FILING DATE: 2002-02-28
; PRIOR APPLICATION NUMBER: US 60/305,340
; PRIOR FILING DATE: 2001-07-13
; PRIOR APPLICATION NUMBER: US 60/275,017
; PRIOR FILING DATE: 2001-03-12
; PRIOR APPLICATION NUMBER: US 60/271,955
; PRIOR FILING DATE: 2001-02-28
; NUMBER OF SEQ ID NOS: 58994
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 36056
; LENGTH: 256
; TYPE: DNA
; ORGANISM: Human
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (2)..(2)
; OTHER INFORMATION: n is a, c, g, or t
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (13)..(13)
; OTHER INFORMATION: n is a, c, g, or t
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (30)..(30)
; OTHER INFORMATION: n is a, c, g, or t
US-10-242-535A-36056

Query Match 12.3%; Score 100; DB 15; Length 256;
Best Local Similarity 99.1%; Pred. No. 8.7e-21;
Matches 111; Conservative 0; Mismatches 0; Indels 1; Gaps 1;

Qy    700 ATTGTTGGACCTTCAG-ATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 758
          |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Db     14 ATTGTTGGACCTTCAGNATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTAC 73

Qy    759 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
          ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     74 AAGACTGAGTATTACTAATGAAACATTTAGAAAAACGCAATTATATCCATAA 125



RESULT 13 

US-10-085-783A-48351
; Sequence 48351, Application US/10085783A
; Publication No. US20040037841A1
; GENERAL INFORMATION:
; APPLICANT: ChondroGene Inc.
; APPLICANT: Liew, C.C.
; TITLE OF INVENTION: Compositions and Methods Relating to Osteoarthritis
; FILE REFERENCE: 4231/2002
; CURRENT APPLICATION NUMBER: US/10/085,783A
; CURRENT FILING DATE: 2002-02-28
; PRIOR APPLICATION NUMBER: US 60/305,340
; PRIOR FILING DATE: 2001-07-13
; PRIOR APPLICATION NUMBER: US 60/275,017
; PRIOR FILING DATE: 2001-03-12
; PRIOR APPLICATION NUMBER: US 60/271,955
; PRIOR FILING DATE: 2001-02-28
; NUMBER OF SEQ ID NOS: 58994
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 48351
; LENGTH: 411
; TYPE: DNA
; ORGANISM: Human
US-10-085-783A-48351

Query Match 10.5%; Score 85.4; DB 12; Length 411;
Best Local Similarity 98.9%; Pred. No. 4.8e-16;
Matches 86; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy    724 AGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACA 783
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 AGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACA 60

Qy    784 TTTAGAAAAACGCAATTATATCCATAA 810
          |||||||||||||| ||||||||||||
Db     61 TTTAGAAAAACGCAGTTATATCCATAA 87



RESULT 14 

US-10-242-535A-48351 

; Sequence 48351, Application US/10242535A 

; Publication No. US20040013663A1 

; GENERAL INFORMATION: 

; APPLICANT: ChondroGene Inc. 

; APPLICANT: Liew, C.C. 

; TITLE OF INVENTION: Compositions and Methods Relating to Osteoarthritis
; FILE REFERENCE: 4231/2005 

; CURRENT APPLICATION NUMBER: US/10/242,535A

; CURRENT FILING DATE: 2002-09-12 

; PRIOR APPLICATION NUMBER: US 10/085,783

; PRIOR FILING DATE: 2002-02-28 

; PRIOR APPLICATION NUMBER: US 60/305,340 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/275,017 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: US 60/271,955 

; PRIOR FILING DATE: 2001-02-28 

; NUMBER OF SEQ ID NOS: 58994 

; SOFTWARE: PatentIn version 3.2

; SEQ ID NO 48351 

; LENGTH: 411

; TYPE: DNA

; ORGANISM: Human 
US-10-242-535A-48351 

Query Match 10.5%; Score 85.4; DB 15; Length 411;
Best Local Similarity 98.9%; Pred. No. 4.8e-16;
Matches 86; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy    724 AGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACA 783
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 AGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTAATGAAACA 60

Qy    784 TTTAGAAAAACGCAATTATATCCATAA 810
          |||||||||||||| ||||||||||||
Db     61 TTTAGAAAAACGCAGTTATATCCATAA 87



RESULT 15 

US-10-085-783A-16414 

; Sequence 16414, Application US/10085783A 

; Publication No. US20040037841A1 

; GENERAL INFORMATION: 

; APPLICANT: ChondroGene Inc. 

; APPLICANT: Liew, C.C. 

; TITLE OF INVENTION: Compositions and Methods Relating to Osteoarthritis
; FILE REFERENCE: 4231/2002 

; CURRENT APPLICATION NUMBER: US/10/085,783A

; CURRENT FILING DATE: 2002-02-28 

; PRIOR APPLICATION NUMBER: US 60/305,340

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/275,017 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: US 60/271,955 

; PRIOR FILING DATE: 2001-02-28 

; NUMBER OF SEQ ID NOS : 58994 

; SOFTWARE: PatentIn version 3.2

; SEQ ID NO 16414 

; LENGTH: 129

; TYPE: DNA
; ORGANISM: Human 
US-10-085-783A-16414 

Query Match 9.8%; Score 79; DB 12; Length 129;
Best Local Similarity 89.5%; Pred. No. 2.3e-14;
Matches 85; Conservative 0; Mismatches 10; Indels 0; Gaps 0;

Qy    716 ATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTGAGTATTACTA 775
          ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||
Db      1 ATGGAAGTAGTTACATTATAGATTACTATGGAACCAGACTTACAAGACTTAGTATTACTA 60

Qy    776 ATGAAACATTTAGAAAAACGCAATTATATCCATAA 810
          |   ||   ||   |||||||||||||||||||||
Db     61 AGTGAAACATTTAGAAAACGCAATTATATCCATAA 95



Search completed: March 4, 2004, 10:22:21 
Job time : 351 secs



